Abstract. A law of large numbers and a central limit theorem are derived for linear statistics of random symmetric matrices whose on-or-above diagonal entries are independent, but neither necessarily identically distributed, nor necessarily all of the same variance. The derivation is based on systematic combinatorial enumeration, study of generating functions, and concentration inequalities of the Poincaré type. Special cases treated, with an explicit evaluation of limiting variances, are generalized Wigner and Wishart matrices.



The interest in the limiting properties of the empirical distribution of eigenvalues of large symmetric random matrices can be traced back to [Wis28] and to the path-breaking article of Wigner [Wig55]. We refer to [Ba99], [He00], [HP00], [Me91] and [PL03] for partial overview and some of the recent spectacular progress in this field.

In this paper we study both convergence of the empirical distribution and central limit theorems for linear statistics of the empirical distribution of a class of random matrices. To give right away the flavor of our results, consider for each positive integer $N$ the $N$-by-$N$ symmetric random matrix $X(N)$ with on-or-above-diagonal entries $X(N)_{ij} = N^{-1/2} f(i/N, j/N)^{1/2}\,\xi_{ij}$, where the $\xi_{ij}$ are zero mean unit variance i.i.d. random variables satisfying the Poincaré inequality with constant $c$ (see §11.7 for the definition), and $f(\cdot,\cdot)$ is a nonnegative function symmetric and continuous on $[0,1]^2$ such that $\int_0^1 f(x,y)\,dy = 1$. Define the semicircle distribution $\sigma_S$ of zero mean and unit variance to be the measure on $\mathbb{R}$ of compact support with density $\frac{d\sigma_S}{dx} := \frac{1}{2\pi}\sqrt{4-x^2}\,\mathbf{1}_{\{|x|<2\}}$. Let $\lambda_1(N) \le \cdots \le \lambda_N(N)$ be the eigenvalues of $X(N)$. Under these assumptions, a corollary of our general results (see Theorem 3.5 below) states that the empirical distribution $L(N) := N^{-1}\sum_{i=1}^N \delta_{\lambda_i(N)}$ converges weakly, in probability, to $\sigma_S$, and further, for any continuously differentiable function $f$ on $\mathbb{R}$ of polynomial growth, with $\|f'\|_{L^2(\sigma_S)} > 0$, the sequence of random variables

\[ \sum_{i=1}^N f(\lambda_i(N)) - E\sum_{i=1}^N f(\lambda_i(N)) \]

converges in distribution to a nondegenerate zero mean Gaussian random variable with variance given by an explicit formula. Similar explicit results hold for a class of generalized Wishart matrices, see Theorem 12.7. For polynomial test functions $f$, such results hold in much greater generality, see Theorem 3.3.
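
The law of large numbers in this example can be probed numerically. The following is only a sketch under stated assumptions: the profile `profile(x, y) = 1 + cos(2*pi*(x - y))/2` is a hypothetical choice satisfying the symmetry, continuity, nonnegativity and unit-integral requirements; the entries are taken Gaussian (which do satisfy a Poincaré inequality); and the size 60 and the seed are arbitrary. It compares the normalized traces of $X(N)^2$ and $X(N)^4$ with the second and fourth moments of the semicircle distribution, which are 1 and 2.

```python
import math
import random


def profile(x, y):
    """Hypothetical variance profile (an assumption, not from the paper):
    symmetric, continuous, nonnegative, and integrating to 1 in y."""
    return 1.0 + 0.5 * math.cos(2.0 * math.pi * (x - y))


def sample_matrix(n, rng):
    """Symmetric n-by-n matrix with entries n^{-1/2} f(i/n, j/n)^{1/2} xi_ij,
    where the xi_ij (i <= j) are independent standard Gaussians."""
    x = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            scale = math.sqrt(profile((i + 1) / n, (j + 1) / n) / n)
            value = scale * rng.gauss(0.0, 1.0)
            x[i][j] = value
            x[j][i] = value
    return x


def normalized_trace_moments(x):
    """Return (tr X^2 / n, tr X^4 / n); for symmetric X, tr X^4 is the sum of
    the squared entries of X^2."""
    n = len(x)
    m2 = sum(v * v for row in x for v in row) / n
    m4 = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(x[i][k] * x[k][j] for k in range(n))  # (X^2)_{ij}
            m4 += entry * entry
    return m2, m4 / n


rng = random.Random(0)  # fixed seed so that the run is reproducible
m2, m4 = normalized_trace_moments(sample_matrix(60, rng))
print(m2, m4)  # both should be close to the semicircle moments 1 and 2
```

Only low moments are checked here because they can be computed without an eigenvalue solver; the fluctuation statement of the central limit theorem is not tested by this sketch.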

Our approach has two main components. The first, of some interest on its own, is a combinatorial enumeration scheme for the different types of terms that contribute to the expectation of products of traces of powers of the matrices under study. This scheme takes the bulk of the paper to develop. The other component, which allows us to move from polynomial test functions to continuously differentiable ones, is based on concentration inequalities of the Poincaré type. The latter component is developed in §11, building on earlier results of concentration for random matrices that can be found in [CB04] and [GZ00].

CLT results related to Theorems 3.3 and 3.5 below have already been stated in the literature. An especially strong inspiration to our study is the work of Jonsson [Jo82], who gives CLT statements for traces of polynomial functions of Gaussian Wishart matrices, based on the method of moments introduced by Wigner in [Wig55]. The method of moments was revisited in the far-reaching work of Sinai and Soshnikov [SS98], where, as an easy by-product of their results, they state a CLT for traces of analytic functions of Wigner-type matrices. Pastur and co-authors, on the one hand, and Bai and co-authors, on the other, have championed an approach based on the evaluation of resolvents. The latter approach has the advantage of allowing one to relax hypotheses on matrix entries; in particular one does not need to have all moments finite. CLT statements based on these techniques and expressions for the resulting variance, for functions of the form $f(x) = \sum a_i/(z_i - x)$ where $z_i \in \mathbb{C}\setminus\mathbb{R}$, and matrices of Wigner type, can be found in [KKP96], with somewhat sketchy proofs. Earlier statements can be found in [Gi90]. A complete treatment for $f$ analytic in a domain including the support of the limit of the empirical distribution of eigenvalues is given in [BY05] for matrices of Wigner type, and in [BS04] for matrices of Wishart type under a certain restriction on fourth moments. Much more is known for restricted classes of matrices: Johansson [Joh98], using an approach based on the explicit joint density of the eigenvalues available in the independent case only in the Gaussian Wigner situation, characterizes completely those functions $f$ for which a CLT holds. Cabanal-Duvillard [CD01] introduces a stochastic calculus approach and proves a CLT for traces of polynomials of Gaussian Wigner and Wishart matrices, as well as for traces of non-commutative polynomials of pairs of independent Gaussian Wigner matrices. Recent extensions and reinterpretation of his work, using the notion of second order freeness, can be found in [MS04]. Still in the Gaussian case, Guionnet [Gu02], using a stochastic calculus approach, gives a CLT (with a somewhat implicit variance computation) for a class of functions $f$ in the case of band matrices. Earlier, laws of large numbers for band matrices were derived, see e.g. [MPK92], [Sh96] and the references therein. In comparison with the references mentioned above, our work can be seen as relaxing the structural assumptions on the variance of the entries of the matrix $X(N)$, as well as the Gaussian assumption, while still requiring rather strong moment bounds on the individual entries (if one is interested only in polynomial test functions) or Poincaré type conditions on the entries (if one wants a wider class of test functions).

The structure of the article is as follows. In §2 we introduce the matrix model considered throughout the paper and set basic notations. Our main results for polynomial test functions $f$ are stated in §3. §4 develops the language we use in the combinatorial enumeration mentioned above, §5 is devoted to some preliminary limit calculations which are then immediately applied in §6 to prove our main result concerning limiting spectral measures, Theorem 3.2. §7 is devoted to the derivation of some a priori estimates, following [FK81], useful in the study of the support of the empirical distribution $L(\mathcal{N})$. §8 is the heart of our enumeration scheme, and the results are immediately applied in §9 to yield the proof of our main CLT statement, Theorem 3.3. §10 is devoted to the proof of Theorem 3.4, which is a technical result describing how to approximate $E\,\mathrm{tr}\,X(\mathcal{N})^n$ at CLT scale; this part of the paper may be skipped without much loss of comprehension of the remainder of the paper. §11 is devoted to concentration of measure results based on the Poincaré inequality. Finally, in §12 we specialize our main results to generalized Wigner and Wishart matrices, and derive explicit representations for the resulting variances.

2. The model 

We define in this section the class of random matrices we are going to deal 
with. Matrices of this class are symmetric, with on-or-above-diagonal entries in- 
dependent, with all entries possessing moments of all orders, and with off-diagonal 
entries of mean zero; further and crucially, subject to the constraints of symme- 
try and vanishing of off-diagonal means, the moments of entries of such matrices 
are allowed to depend upon position. Now as it turns out, only certain statistical 
properties of the patterns of first, second and fourth moments of entries figure in 
our limit formulas. Accordingly, our description of the class is contrived so as to 
emphasize those statistical properties and to suppress unneeded detail concerning 
the exact dependence of moments of entries on position. The notion crucial for 
gaining "statistical control" is that of color. The reader interested only in Wigner 
matrices should take as space of colors a space consisting of a single color. 

2.1. The band matrix model.

2.1.1. Colors. We fix a Polish space, elements of which we call colors. We declare the Borel sets of color space to be measurable. We fix a probability measure $\theta$ on color space. We fix a bounded measurable real-valued function $D$ on color space. For each positive integer $k$ we fix a bounded measurable nonnegative function $d^{(k)}$ on color space and a symmetric bounded measurable nonnegative function $s^{(k)}$ on the product of two copies of color space. We make the following assumptions:

• $d^{(k)}$ is constant for $k \neq 2$.

• $s^{(k)}$ is constant for $k \notin \{2, 4\}$.

• $s^{(k)}$ has discontinuity set of measure zero with respect to $\theta \otimes \theta$.

• $D$, $d^{(k)}$ and the diagonal restriction of $s^{(k)}$ have discontinuity sets of measure zero with respect to $\theta$.

For any bounded function $f$ on color space, or on a product of copies of color space, we write $|f|_\infty$ for its supremum norm.

2.1.2. Letters. We fix a countably infinite set, elements of which we call letters. We fix a function $\kappa_0$ from letter space to color space, and we say that $\kappa_0(\alpha)$ is the color of the letter $\alpha$. Given any nonempty finite set $\mathcal{N}$ of letters of cardinality $N$ put

\[ \theta_{\mathcal{N}} = \frac{1}{N}\sum_{\alpha \in \mathcal{N}} \delta_{\kappa_0(\alpha)}, \]

which is the color distribution of letters belonging to $\mathcal{N}$. We reserve the script letter $\mathcal{N}$ for use in this context and invariably denote the cardinality of $\mathcal{N}$ by the roman letter $N$. Analogously, given a sequence $\mathcal{N}_1, \mathcal{N}_2, \mathcal{N}_3, \dots$ of finite nonempty sets of letters, $N_1, N_2, N_3, \dots$ denotes the corresponding sequence of cardinalities.

2.1.3. The family $\{\xi_e\}$ of random variables. We fix a family $\{\xi_e\}$ of independent real-valued mean zero random variables indexed by unordered pairs $e$ of letters. We assume that for all letters $\alpha$, $\beta$ and positive integers $k$ we have

\[ E|\xi_{\{\alpha,\beta\}}|^k \le \begin{cases} s^{(k)}(\kappa_0(\alpha),\kappa_0(\beta)) & \text{if } \alpha \neq \beta,\\ d^{(k)}(\kappa_0(\alpha)) & \text{if } \alpha = \beta, \end{cases} \tag{2} \]

and moreover we assume that equality holds above whenever one of the following conditions holds:

• $k = 2$.

• $\alpha \neq \beta$ and $k = 4$.

In other words, the rule is to enforce equality whenever the not-necessarily-constant functions $d^{(2)}$, $s^{(2)}$ or $s^{(4)}$ are involved, but otherwise merely to impose a bound.

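2.1.4. Random matrices. Given any nonempty finite set $\mathcal{N}$ of letters, let $X(\mathcal{N})$ be the $N \times N$ real symmetric random matrix with entries

\[ X(\mathcal{N})_{\alpha\beta} = N^{-1/2}\,\xi_{\{\alpha,\beta\}} + \delta_{\alpha\beta}\,D(\kappa_0(\alpha)) \qquad (\alpha,\beta \in \mathcal{N}), \]

denote the eigenvalues of $X(\mathcal{N})$ by $\lambda_1(\mathcal{N}) \le \cdots \le \lambda_N(\mathcal{N})$, and let

\[ L(\mathcal{N}) = \frac{1}{N}\sum_{i=1}^N \delta_{\lambda_i(\mathcal{N})} \]

be the empirical distribution of the spectrum of $X(\mathcal{N})$. Put

\[ \bar{L}(\mathcal{N}) = E L(\mathcal{N}). \]

Note that

\[ \langle \bar{L}(\mathcal{N}), x^n \rangle = \frac{1}{N}\,E\,\mathrm{tr}\,X(\mathcal{N})^n, \]

where here and often below we employ the abbreviated notation

\[ \langle \mu, f \rangle = \int f(x)\,\mu(dx) \]

for integrals.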

2.2. Generating functions.

2.2.1. Let $\sigma$ be any probability measure on color space. Let

\[ \{\Phi_{n,\sigma}\}_{n=1}^\infty \]

be the unique sequence of real-valued bounded measurable functions on color space characterized by the generating function identity

\[ \Phi_\sigma(c,t) = \frac{t}{1 - D(c)t}\left(1 - \frac{t}{1 - D(c)t}\int s^{(2)}(c,c')\,\Phi_\sigma(c',t)\,\sigma(dc')\right)^{-1} \tag{3} \]

where

\[ \Phi_\sigma(c,t) = \sum_{n=1}^\infty \Phi_{n,\sigma}(c)\,t^n \]

is the corresponding generating function. We emphasize that we view the power series here formally, i.e., as devices for managing sequences, not as analytic functions. We write (3) as a shorthand for the recursion obtained by formally expanding both sides of (3) in powers of $t$, and then equating coefficients of like powers of $t$. When $\sigma = \theta$, we omit it from the notation.
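
To make the recursion concrete, here is a hedged sketch on a finite color space. It assumes that identity (3) is equivalent, after clearing denominators, to $\Phi_\sigma = t + tD\Phi_\sigma + t\,\Phi_\sigma\int s^{(2)}\Phi_\sigma\,d\sigma$, so that $\Phi_{1,\sigma} = 1$ and $\Phi_{n+1,\sigma}(c) = D(c)\Phi_{n,\sigma}(c) + \sum_{i+j=n}\Phi_{i,\sigma}(c)\int s^{(2)}(c,c')\Phi_{j,\sigma}(c')\,\sigma(dc')$ with $i,j \ge 1$. The function names and the list-based encoding of $\sigma$, $D$ and $s^{(2)}$ are illustrative choices, not notation from the paper.

```python
def phi_coefficients(sigma, D, s2, n_max):
    """Coefficient functions Phi_1, ..., Phi_{n_max} on the colors 0..m-1.

    sigma[c]  : probability of color c
    D[c]      : value of D at color c
    s2[c][c2] : value of s^(2) at (c, c2)
    Returns a list phi with phi[n][c] = Phi_n(c); phi[0] is a zero placeholder.
    """
    m = len(sigma)

    def integrate(f):
        # c -> integral of s^(2)(c, c') f(c') sigma(dc')
        return [sum(s2[c][c2] * f[c2] * sigma[c2] for c2 in range(m))
                for c in range(m)]

    phi = [[0.0] * m, [1.0] * m]                 # Phi_1 = 1
    smooth = [integrate(phi[0]), integrate(phi[1])]
    for n in range(1, n_max):
        nxt = []
        for c in range(m):
            conv = sum(phi[i][c] * smooth[n - i][c] for i in range(1, n))
            nxt.append(D[c] * phi[n][c] + conv)  # this is Phi_{n+1}(c)
        phi.append(nxt)
        smooth.append(integrate(nxt))
    return phi


def moments(sigma, D, s2, n_max):
    """Return the list of <sigma, Phi_{n+1}> for n = 0, ..., n_max."""
    phi = phi_coefficients(sigma, D, s2, n_max + 1)
    return [sum(p * w for p, w in zip(phi[n + 1], sigma))
            for n in range(n_max + 1)]


# One color, D = 0, s^(2) = 1: moments of the unit-variance semicircle law.
print(moments([1.0], [0.0], [[1.0]], 6))
```

With a single color, $D = 0$ and $s^{(2)} = 1$, the even-indexed outputs are the Catalan numbers 1, 1, 2, 5, which is the classical Wigner case.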
2.2.2. For each positive integer $r$ we define a function

\[ K_r(c_1,\dots,c_r) = \begin{cases} s^{(2)}(c_1,c_1) & \text{if } r = 1,\\ s^{(2)}(c_1,c_2)^2 & \text{if } r = 2,\\ s^{(2)}(c_1,c_2)\,s^{(2)}(c_2,c_3)\cdots s^{(2)}(c_r,c_1) & \text{if } r \ge 3, \end{cases} \]

on the product of $r$ copies of color space. We define

\[ \Theta(x,y) = \sum_{r=1}^\infty \frac{1}{r}\int\cdots\int K_r(c_1,\dots,c_r)\prod_{i=1}^r \bigl(\Phi(c_i,x)\,\Phi(c_i,y)\,\theta(dc_i)\bigr). \tag{4} \]

We view $\Theta(x,y)$ as a formal power series in $x$ and $y$ with real coefficients; in keeping with this point of view, the integrals on the right side of (4) are to be evaluated by first expanding the integrands in powers of $x$ and $y$ and then integrating term by term (and hence, all integrals, being expectations of bounded measurable functions, are well defined).

2.2.3. Put

\[ \Psi(x,y) = \int \bigl(d^{(2)}(c) - 2s^{(2)}(c,c)\bigr)\,\Phi(c,x)\,\Phi(c,y)\,\theta(dc) + \frac{1}{2}\iint \bigl(s^{(4)}(c_1,c_2) - 3s^{(2)}(c_1,c_2)^2\bigr) \times \Phi(c_1,x)\,\Phi(c_2,x)\,\Phi(c_1,y)\,\Phi(c_2,y)\,\theta(dc_1)\,\theta(dc_2). \]

We view $\Psi(x,y)$ as a formal power series in $x$ and $y$ with real coefficients. As above, the integrals are to be evaluated by first expanding integrands in powers of $x$ and $y$ and then integrating term by term.

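2.2.4. In order to gain convenient access to the information coded in the formal power series $\Theta(x,y)$ and $\Psi(x,y)$ we introduce the following (abuse of) notation. We write

\[ \Bigl\langle \sum_{i=0}^\infty a_i x^i,\ \sum_{i=0}^\infty b_i x^i \Bigr\rangle = \sum_{i=0}^\infty a_i b_i \]

for any sequences $\{a_i\}_{i=0}^\infty$ and $\{b_i\}_{i=0}^\infty$ of real numbers such that the sum on the right has only finitely many nonzero terms. Similarly we write

\[ \Bigl\langle \sum_{i=0}^\infty\sum_{j=0}^\infty a_{ij} x^i y^j,\ \sum_{i=0}^\infty\sum_{j=0}^\infty b_{ij} x^i y^j \Bigr\rangle = \sum_{i=0}^\infty\sum_{j=0}^\infty a_{ij} b_{ij} \]

for any doubly infinite sequences $\{a_{ij}\}_{i,j=0}^\infty$ and $\{b_{ij}\}_{i,j=0}^\infty$ of real numbers such that the sum on the right has only finitely many nonzero terms.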

The following fact, proved below, explains the role of the sequence $\Phi_{n,\sigma}$.

Lemma 2.3. If $\sigma = \theta$ or $\sigma = \theta_{\mathcal{N}}$ for some finite nonempty set of letters $\mathcal{N}$, then there exists a unique probability measure $\mu_\sigma$ on the real line such that

\[ \langle \mu_\sigma, x^n \rangle = \langle \sigma, \Phi_{n+1,\sigma} \rangle \qquad (n = 0,1,2,\dots), \tag{6} \]

\[ \operatorname{supp}\mu_\sigma \subset [-C, C] \qquad \bigl(C = 2(|D|_\infty + |s^{(2)}|_\infty^{1/2})\bigr). \tag{7} \]

In what follows, we write $\mu = \mu_\theta$ and $\mu_{\mathcal{N}} = \mu_{\theta_{\mathcal{N}}}$.

3. Assumptions and main theorem

Throughout, we let $\Rightarrow$ denote the weak convergence of probability measures. All of our results will be obtained under the following basic

Assumption 3.1. All the assumptions and notations in §2 hold. Further, there exists a sequence $\{\mathcal{N}_k\}_{k=1}^\infty$ of finite nonempty sets of letters such that $N_k \to \infty$ and $\theta_{\mathcal{N}_k} \Rightarrow \theta$.

Our main results are:

Theorem 3.2. Let Assumption 3.1 hold. Then: (i) $L(\mathcal{N}_k) \Rightarrow \mu$ in probability, (ii) $\mu_{\mathcal{N}_k} \Rightarrow \mu$.

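Theorem 3.3. Let Assumption 3.1 hold. Fix a real-valued polynomial function $f(\cdot)$ on the real line. Then the sequence of random variables

\[ Z_{f,k} := \mathrm{tr}\,f(X(\mathcal{N}_k)) - E\,\mathrm{tr}\,f(X(\mathcal{N}_k)) \]

converges in distribution to a zero mean Gaussian random variable $Z_f$ of variance

\[ E Z_f^2 = \langle 2\Theta(x,y) + \Psi(x,y),\ x f'(x)\,y f'(y) \rangle. \tag{8} \]

Theorem 3.4. In the setting of the preceding theorem, we also have

\[ \lim_{k\to\infty} \bigl( E\,\mathrm{tr}\,f(X(\mathcal{N}_k)) - N_k \cdot \langle \mu_{\mathcal{N}_k}, f \rangle \bigr) = \frac{1}{2}\langle \Theta(t,t) + \Psi(t,t),\ t f'(t) \rangle =: E_f. \tag{9} \]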

We state the formulas (8) and (9) in separate theorems because their proofs are separated in the main body of the paper. In fact, we have structured the paper so that the reader interested only in (8) and its applications can largely ignore the extra (and somewhat heavy) apparatus needed to prove (9).

The results above can be made more transparent, and their range extended, for 
certain special cases. Of particular interest is the following: 

Theorem 3.5. Let Assumption 3.1 hold, and further assume that

\[ D \equiv 0, \qquad \int s^{(2)}(c,c')\,\theta(dc') \equiv 1. \tag{10} \]

Then: (i) $\mu$ is the semicircle law $\sigma_S$ of zero mean and unit variance. (ii) For polynomial functions $f$ the random variables $Z_{f,k}$ converge in distribution toward a mean zero Gaussian random variable $Z_f$ with variance given by the explicit formula derived in §12. (iii) If the random variables $\xi_{\{\alpha,\beta\}}$ satisfy a Poincaré inequality with common constant $c$ (see §11 for definitions), then statement (ii) extends to continuously differentiable functions $f$ with polynomial growth, with variance again given by the same formula.

We refer to the situation in Theorem 3.5 above as the generalized Wigner matrix model, because when $s^{(2)} \equiv 1$ one recovers Wigner matrices. The expression $E_f$ in (9) can also be computed in this case, see below. We note in passing that for Gaussian matrices, the condition (10) has been identified in [NSS02, Corollary 3.4] as sufficient and necessary (if $D \equiv 0$) for $\mu$ to equal the semicircle distribution.

Similar considerations apply to the generalized Wishart matrix model, see §12.6 for details.

4. Basic spelling, grammar and counting

We introduce in this section the basic language employed throughout the paper for discussing enumeration problems. From letters we build words, from words we build sentences, and then we distinguish certain classes of words and sentences in terms of properties of naturally associated graphs. The classes of words and sentences singled out here for special attention are eventually going to be used to enumerate the terms in sums giving the (mixed and/or centered) moments of traces of powers of our random matrices. In particular, the Wigner words enumerate the only terms whose contributions to the law of large numbers for linear statistics do not vanish in the limit, whereas the CLT word-pairs take care of the only terms whose contributions to the CLT variance do not vanish in the limit. Further, the CLT sentences (which can be built up systematically from the CLT word-pairs) enumerate the nonnegligible terms in sums giving mixed centered moments of traces of powers of our random matrices. Critical weak Wigner words, and marked Wigner words, are needed (only) in the evaluation of the mean shift of linear statistics.

4.1. Words and sentences. A word is a finite sequence of letters at least one letter long. (Words are never empty!) We denote the length of a word $w$ by $\ell(w)$. We say that a word $w$ is closed if the first and last letters of $w$ are the same. (Every one-letter word is automatically closed.) We view letters as one-letter words. A sentence is a finite sequence of words at least one word long. (Sentences are never empty, nor do they contain empty words!) We view words as one-word sentences. The support $\operatorname{supp} a$ of a sentence $a$ is the set of letters appearing in $a$, and the combinatorial weight $\operatorname{wt} a$ is the cardinality of $\operatorname{supp} a$. We say that sentences $a$ and $b$ are disjoint if $\operatorname{supp} a \cap \operatorname{supp} b = \emptyset$. We say that sentences $a$ and $b$ are equivalent and write $a \sim b$ if there exists a one-to-one letter-valued function $\varphi$ defined on $\operatorname{supp} a$ such that the result of applying $\varphi$ letter by letter to $a$ is $b$. In other words, $a \sim b$ whenever $a$ codes to $b$ under a simple substitution cipher.
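
Equivalence under a substitution cipher can be tested mechanically by relabeling letters in order of first appearance. The sketch below is illustrative only: it assumes a sentence is encoded as a list of words and a word as a string (or any sequence) of hashable letters, and the names `canonical`, `equivalent` and `weight` are not from the paper.

```python
def canonical(sentence):
    """Relabel the letters of a sentence as 1, 2, 3, ... in order of first
    appearance, keeping the division into words (the "punctuation")."""
    code = {}
    result = []
    for word in sentence:
        relabeled = []
        for letter in word:
            if letter not in code:
                code[letter] = len(code) + 1
            relabeled.append(code[letter])
        result.append(tuple(relabeled))
    return tuple(result)


def equivalent(a, b):
    """True when some one-to-one map on letters carries sentence a to sentence b."""
    return canonical(a) == canonical(b)


def weight(sentence):
    """Combinatorial weight: the number of distinct letters in the sentence."""
    return len({letter for word in sentence for letter in word})


print(equivalent(["121"], ["ded"]))            # same pattern of repeats
print(equivalent(["123", "123"], ["123123"]))  # different punctuation
```

Two sentences are equivalent exactly when their canonical forms agree, so the canonical forms also serve as a cross-section of any set of sentences closed under relabeling.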

We warn the reader that we distinguish between a sentence $a$ and the word $w$ obtained by concatenating all words in $a$; the "punctuation" carries information important for our purposes and therefore must not be ignored. For example, taking the set $\{1,2,3\}$ temporarily as our alphabet, the word 123123, the two-word sentence [123, 123] and the three-word sentence [1, 231, 23] are distinct objects according to our point of view.

4.2. Graphs. We fix terminology concerning graphs in a slightly restrictive but convenient way as follows. A graph $G = (V,E)$ is an ordered pair consisting of a finite nonempty set $V$ of letters and a set $E$ (possibly empty), where each element of $E$ is an unordered pair of elements of $V$, i.e., a subset of $V$ of cardinality 1 or 2. Elements of $V$ are called vertices of $G$, elements of $E$ are called edges of $G$, and edges of cardinality 1 are said to be degenerate. We say that a word $w = \alpha_1\cdots\alpha_n$ of $n$ letters is a walk on $G$ provided that $\alpha_i \in V$ for $i = 1,\dots,n$, and $\{\alpha_i,\alpha_{i+1}\} \in E$ for $i = 1,\dots,n-1$, in which case we say that each of the vertices $\alpha_i$ and edges $\{\alpha_i,\alpha_{i+1}\}$ of $G$ is visited by $w$. A geodesic in $G$ is a walk visiting no vertex more than once. We say that $G$ is connected if any two vertices are joined by a walk. If $G$ is connected then $\#E \ge \#V - 1$. We call $G$ a tree if $G$ is connected and $G$ has no nontrivial loops (in particular, $G$ has no degenerate edges). Every two vertices of a tree are joined by a unique geodesic. For $G$ to be a tree it is necessary and sufficient that $G$ be connected and $\#E \le \#V - 1$. A graph $G' = (V',E')$ where $V' \subset V$ and $E' \subset E$ is called a subgraph of $G$. A connected component of $G$ is a connected subgraph of $G$ maximal in the family of connected subgraphs of $G$. We call $G$ a forest if every connected component of $G$ is a tree. A spanning forest in $G$ is a graph $G' = (V',E')$ with $V' = V$ and $E' \subset E$ such that $G'$ is a forest having the same number of connected components as does $G$. Every graph contains at least one spanning forest.

4.3. Orthographic and grammatical notions.

4.3.1. The graph associated to a sentence. Given a sentence

\[ a = [w_i]_{i=1}^n = \bigl[[\alpha_{i,j}]_{j=1}^{\ell(w_i)}\bigr]_{i=1}^n \]

consisting of $n$ words (following a pattern we use often in the sequel, $w_i$ denotes the $i$-th word of the sentence, and $\alpha_{i,j}$ denotes the $j$-th letter of the $i$-th word) we define

\[ G_a = (V_a, E_a) \]

to be the graph with

\[ V_a = \operatorname{supp} a, \qquad E_a = \bigl\{ \{\alpha_{i,j},\alpha_{i,j+1}\} \bigm| i = 1,\dots,n,\ j = 1,\dots,\ell(w_i)-1 \bigr\}. \]

We view each word $w_i$ of the sentence $a$ in the natural way as a walk on $G_a$. We emphasize that $E_a = \emptyset$ if $a$ consists of one-letter words. Note also the difference between the graph associated to the sentence $a$ and the graph associated to the single word consisting of the concatenation of the words of $a$; in general the former has fewer edges than the latter.

4.3.2. Weak Wigner words. A word $w$ is called a weak Wigner word under the following two conditions:

• $w$ is closed.

• $w$ visits every edge of $G_w$ at least twice.

Suppose now that $w$ is a weak Wigner word. If $\operatorname{wt} w = (\ell(w)+1)/2$, then we drop the modifier "weak" and call $w$ a Wigner word. (Every single letter word is automatically a Wigner word.) If $\operatorname{wt} w = (\ell(w)-1)/2$, then we call $w$ a critical weak Wigner word. For example, spelling with the alphabet $\{1,2,3\}$, we have that $w = 121$ is a Wigner word and that $w = 12121$ is a critical weak Wigner word.

4.3.3. Weak CLT sentences. Let $a = [w_i]_{i=1}^n$ be a sentence consisting of $n$ words. We say that $a$ is a weak CLT sentence under the following three conditions:

• All the words $w_i$ are closed.

• Jointly the words/walks $w_i$ visit each edge of $G_a$ at least two times.

• For each $i \in \{1,\dots,n\}$ there exists $j \in \{1,\dots,n\}\setminus\{i\}$ such that the graphs $G_{w_i}$ and $G_{w_j}$ (both of which are subgraphs of $G_a$) have an edge in common.

Suppose now that $a$ is a weak CLT sentence. If $\operatorname{wt} a = \sum_{i=1}^n \frac{\ell(w_i)-1}{2}$, then we drop the modifier "weak" and call $a$ a CLT sentence. If $n = 2$ and $a$ is a CLT sentence, then we call $a$ a CLT word-pair. For example, again spelling with the alphabet $\{1,2,3\}$, we have that $a = [1231, 1321]$ is a CLT word-pair.

4.3.4. Marked Wigner words. A marked Wigner word is a three-word sentence $[w, \alpha, \beta]$ where $w$ is a Wigner word, and $\alpha$ and $\beta$ are distinct letters appearing in $w$.

4.3.5. Cyclic permutations. Given a word $w = [\alpha_i]_{i=1}^n$ of length $n$ and a permutation $\sigma$ of $\{1,\dots,n\}$, we define $w^\sigma$ to be the word $[\alpha_{\sigma(i)}]_{i=1}^n$. If $\sigma$ is a power of the cycle $(123\cdots n)$, then we call $\sigma$ a cyclic permutation, and we say that $w^\sigma$ is a cyclic permutation of $w$.
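
The word and sentence classes of §4.3.2 and §4.3.3 are easy to test by machine. The sketch below is illustrative: words are encoded as strings, edges as `frozenset`s of one or two letters, the function names are hypothetical, and the weight conditions are read as $\operatorname{wt} w = (\ell(w) \pm 1)/2$ for words and $\operatorname{wt} a = \sum_i (\ell(w_i)-1)/2$ for sentences.

```python
from collections import Counter


def edge_visits(word):
    """Number of visits of the walk `word` to each edge {a_j, a_{j+1}}."""
    return Counter(frozenset(pair) for pair in zip(word, word[1:]))


def is_closed(word):
    return word[0] == word[-1]


def classify_word(word):
    """Return 'Wigner', 'critical weak Wigner', 'weak Wigner', or None when
    the word is not even a weak Wigner word."""
    visits = edge_visits(word)
    if not is_closed(word) or any(v < 2 for v in visits.values()):
        return None
    wt, length = len(set(word)), len(word)
    if 2 * wt == length + 1:
        return "Wigner"
    if 2 * wt == length - 1:
        return "critical weak Wigner"
    return "weak Wigner"


def is_weak_clt_sentence(sentence):
    per_word = [edge_visits(w) for w in sentence]
    total = sum(per_word, Counter())
    if not all(is_closed(w) for w in sentence):
        return False
    if any(v < 2 for v in total.values()):
        return False
    # every word must share an edge with at least one other word
    for i, vi in enumerate(per_word):
        if not any(set(vi) & set(vj)
                   for j, vj in enumerate(per_word) if j != i):
            return False
    return True


def is_clt_sentence(sentence):
    wt = len({letter for w in sentence for letter in w})
    steps = sum(len(w) - 1 for w in sentence)
    return is_weak_clt_sentence(sentence) and 2 * wt == steps


print(classify_word("121"), classify_word("12121"))
print(is_clt_sentence(["1231", "1321"]))
```

The two printed lines reproduce the examples given in the text: 121 and 12121 for words, and [1231, 1321] for word-pairs.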

Lemma 4.4 ("The parity principle"). Let $G$ be a forest. Let $e$ be an edge of $G$. Let $w$ be a word admitting interpretation as a walk on $G$. Consider the unique geodesic in $G$ with initial and terminal vertices coinciding with those of $w$. Then the word/walk $w$ visits the edge $e$ an odd number of times if and only if the geodesic visits $e$.

This simple principle is repeatedly applied in the sequel. The proof is elementary 
and therefore omitted. 
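
Although the proof is omitted, the parity principle is simple to check on examples. The following sketch is an illustration under assumed encodings: a graph is given as a list of edges (pairs of vertices), a walk as a list of vertices, and the geodesic is found by breadth-first search, which returns the unique geodesic when the graph is a forest. The function names are hypothetical.

```python
from collections import Counter, deque


def geodesic(edges, start, end):
    """Vertices along a shortest walk from start to end (unique in a forest);
    returns None when start and end lie in different components."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if end not in parent:
        return None
    path = [end]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]


def parity_principle_holds(edges, walk):
    """Check, edge by edge, that the walk visits an edge an odd number of
    times exactly when the geodesic between its endpoints uses that edge."""
    visits = Counter(frozenset(p) for p in zip(walk, walk[1:]))
    path = geodesic(edges, walk[0], walk[-1])
    on_geodesic = {frozenset(p) for p in zip(path, path[1:])}
    return all((visits[frozenset(e)] % 2 == 1) == (frozenset(e) in on_geodesic)
               for e in edges)


tree = [(1, 2), (2, 3), (2, 4)]
print(parity_principle_holds(tree, [1, 2, 3, 2, 4, 2, 1, 2, 3]))
```

On a graph containing a cycle the check can fail, for instance for the closed walk 1, 2, 3, 1 on a triangle, which shows why the forest hypothesis is needed.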

Proposition 4.5. Let $w$ be a weak Wigner word. (i) We have $\operatorname{wt} w \le \frac{\ell(w)+1}{2}$ with equality if and only if $G_w$ is a tree. (ii) If $\operatorname{wt} w = \frac{\ell(w)+1}{2}$ then $w$ visits every edge of the tree $G_w$ exactly twice. (iii) $w$ is a Wigner word if and only if there exists a decomposition $w = \alpha w_1 \cdots \alpha w_r \alpha$ where $\alpha$ is the first letter of $w$ and $w_1,\dots,w_r$ are pairwise disjoint Wigner words in which $\alpha$ does not occur. (iv) The inequality $\frac{\ell(w)-1}{2} < \operatorname{wt} w < \frac{\ell(w)+1}{2}$ is impossible.

These are ideas coming up in some proofs of Wigner's semicircle law by the method 
of moments. 

Proof. Put $G = (V,E) = G_w = (V_w,E_w)$. (i) The existence of the walk $w$ makes it clear that $G$ is connected. We have

\[ \operatorname{wt} w - 1 \le \#E \le \frac{\ell(w)-1}{2}, \tag{11} \]

on the left because $G$ is connected, and on the right by the hypothesis that $w$ is a weak Wigner word. The result follows. (ii) Clear. (iii)($\Leftarrow$) Trivial. (iii)($\Rightarrow$) By (i) and (ii) already proved, the parts of the tree $G$ explored by the walk $w$ between successive visits to the vertex $\alpha$ have to be disjoint. (iv) Suppose rather that the inequality in question holds. Then $\ell(w)$ is even and $\#V = \ell(w)/2$, hence by (11) we have $\#E = \ell(w)/2 - 1 = \#V - 1$, and hence $G$ is a tree. We now arrive at a contradiction: by the parity principle $w$ cannot be both closed and a walk that takes an odd number of steps. □
As a consequence of Proposition 4.5 one can visualize equivalence classes of Wigner words as rooted planar trees, with the Wigner word determining an exploration path on the tree that visits each vertex at least once and goes over each edge exactly twice, cf. Figure 1. We do not make explicit use of this correspondence but it does drive much of our intuition.

4.6. Cross-sections. We say that a set of sentences $A$ is a cross-section of a set of sentences $S$ if $A \subset S$ and for each sentence $s \in S$ there exists exactly one sentence in $A$ equivalent to $s$. All the cross-sections of $S$ arise by a process of selecting exactly one element from each $\sim$-equivalence class in $S$.

Figure 1. The rooted planar tree (solid) and exploration process (dotted) corresponding to the equivalence class of the Wigner word $w = 123242151$. Note the decomposition $w = 1w_11w_21$ with $w_1 = 23242$ and $w_2 = 5$.

4.7. Enumeration of Wigner words by Wigner words. Fix a letter $\alpha$. For each positive integer $i$ choose a cross-section $W_i$ of the set of Wigner words so as to achieve the following conditions:

• For all $i$, the letter $\alpha$ appears in no word belonging to $W_i$.

• For all distinct $i$ and $j$, every word belonging to $W_i$ is disjoint from every word belonging to $W_j$.

(This is always possible to achieve because the set of letters is countably infinite.) Let $\varphi$ be any real-valued function of words such that $\varphi(w)$ depends only on the equivalence class of $w$ and vanishes for $\ell(w) \gg 0$, in which case the support of $\varphi$ consists of only finitely many equivalence classes of words. By Proposition 4.5(iii) we have an enumeration formula

\[ \sum_w \varphi(w) = \sum_{r=0}^\infty\ \sum_{w_1 \in W_1} \cdots \sum_{w_r \in W_r} \varphi(\alpha w_1 \cdots \alpha w_r \alpha) \tag{12} \]

where $w$ ranges over any cross-section of the set of Wigner words. Formula (12) leads to many useful recursions. For example, it implies that the number of equivalence classes of Wigner words of length $2n+1$ is the $n$-th Catalan number $\frac{1}{n+1}\binom{2n}{n}$. The latter fact is anyhow clear from the rooted planar tree interpretation of equivalence classes of Wigner words.
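
The Catalan count can be confirmed by brute force for small $n$. The sketch below is a check, not the argument of the paper: it lists one representative of every equivalence class of words of length $2n+1$ (letters numbered by first appearance) and counts those that are Wigner words. Function names are illustrative.

```python
from collections import Counter
from math import comb


def canonical_words(length):
    """Yield one representative per equivalence class of words of the given
    length: letters are 1, 2, ... numbered in order of first appearance."""
    def extend(prefix, used):
        if len(prefix) == length:
            yield tuple(prefix)
            return
        for letter in range(1, used + 2):
            prefix.append(letter)
            yield from extend(prefix, max(used, letter))
            prefix.pop()
    yield from extend([1], 1)


def is_wigner(word):
    """Closed, every edge visited at least twice, and wt w = (l(w) + 1) / 2."""
    if word[0] != word[-1]:
        return False
    visits = Counter(frozenset(p) for p in zip(word, word[1:]))
    if any(v < 2 for v in visits.values()):
        return False
    return 2 * len(set(word)) == len(word) + 1


def count_wigner_classes(n):
    return sum(1 for w in canonical_words(2 * n + 1) if is_wigner(w))


def catalan(n):
    return comb(2 * n, n) // (n + 1)


for n in range(5):
    print(n, count_wigner_classes(n), catalan(n))
```

The enumeration grows like the Bell numbers, so this check is only practical for small $n$; the recursion coming from (12) is the efficient route.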

Proposition 4.8. Let $w$ be a critical weak Wigner word. Put

\[ G = (V,E) = G_w = (V_w,E_w). \]

The following hold:

(1) $G$ is connected.

(2) Either $\#V - 1 = \#E$ or $\#V = \#E$.

(3) If $\#V - 1 = \#E$, then:

(a) $G$ is a tree.

(b) With exactly one exception $w$ visits each edge of $G$ exactly twice.

(c) But $w$ visits the exceptional edge exactly four times.

(4) If $\#V = \#E$, then:

(a) $G$ is not a tree.

(b) $w$ visits each edge of $G$ exactly twice.

We state these facts for the sake of convenient reference. We omit the easy proofs. 

Proposition 4.9. Let $a = [w_i]_{i=1}^n$ be a weak CLT sentence consisting of $n$ words. (i) We have $\operatorname{wt} a \le \sum_{i=1}^n \frac{\ell(w_i)-1}{2}$. (ii) Suppose now that equality holds, i.e., that $a$ is a CLT sentence. Then the words $w_i$ of the sentence $a$ are perfectly matched in the sense that for all $i$ there exists unique $j$ distinct from $i$ such that $w_i$ and $w_j$ have a letter in common. In particular, $n$ is even.

This assertion (without proof) was made in [Jo82].

Proof. Lemma 4.10 below is the essential point of the proof. □

Lemma 4.10. Let $a = [w_i]_{i=1}^n$ be a weak CLT sentence consisting of $n$ words. Put $G = G_a$. Let $k$ be the number of connected components of $G$. Then (i) $k \le \lfloor n/2 \rfloor$ and (ii) $\operatorname{wt} a \le k - n + \bigl\lfloor \sum_{i=1}^n \frac{\ell(w_i)}{2} \bigr\rfloor$, where $\lfloor x \rfloor$ denotes the greatest integer less than or equal to $x$.

Proof. Inequality (i) is trivial: by hypothesis every word $w_i$ of the sentence $a$ is "mated" with at least one other word $w_j$ ($j \neq i$) of the sentence in the sense that the connected subgraphs $G_{w_i}$ and $G_{w_j}$ share an edge and a fortiori share a vertex.

Harder work is required to prove inequality (ii). Put $a = \bigl[[\alpha_{i,j}]_{j=1}^{\ell(w_i)}\bigr]_{i=1}^n$, $I = \bigcup_{i=1}^n \{i\} \times \{1,\dots,\ell(w_i)-1\}$ and $A = [\{\alpha_{i,j},\alpha_{i,j+1}\}]_{(i,j)\in I}$. We visualize $A$ as a left-justified table of $n$ rows. Let $G' = (V',E')$ be any spanning forest in $G$. Since every connected component of $G'$ is a tree, we have $\operatorname{wt} a = k + \#E'$ and so in order to prove (ii), we just have to bound $\#E'$. Now let $X = \{X_{ij}\}_{(i,j)\in I}$ be a table of the same "shape" as $A$, but with all entries equal either to 0 or 1. We call $X$ an edge-bounding table under the following conditions:

• For all $(i,j) \in I$, if $X_{ij} = 1$, then $A_{ij} \in E'$.

• For each $e \in E'$ there exist distinct $(i_1,j_1), (i_2,j_2) \in I$ such that $X_{i_1j_1} = X_{i_2j_2} = 1$ and $A_{i_1j_1} = A_{i_2j_2} = e$.

• For each $e \in E'$ and index $i \in \{1,\dots,n\}$, if $e$ appears in the $i$-th row of $A$ then there exists $(i,j) \in I$ such that $A_{ij} = e$ and $X_{ij} = 1$.

For any edge-bounding table $X$ the corresponding quantity $\frac{1}{2}\sum_{(i,j)\in I} X_{ij}$ bounds $\#E'$, whence the terminology. At least one edge-bounding table exists, namely the table with a 1 in position $(i,j)$ for each $(i,j) \in I$ such that $A_{ij} \in E'$ and 0's elsewhere. Now let $X$ be an edge-bounding table such that for some index $i_0$ all the entries of $X$ in the $i_0$-th row are equal to 1. Then the closed word $w_{i_0}$ is a walk in $G'$, and hence by the parity principle every entry in the $i_0$-th row of $A$ appears there an even number of times and a fortiori at least twice. Now choose $(i_0,j_0) \in I$ such that $A_{i_0j_0} \in E'$ appears in more than one row of $A$. Let $Y$ be the table obtained by replacing the entry 1 of $X$ in position $(i_0,j_0)$ by the entry 0. Then it is not difficult to check that $Y$ is again an edge-bounding table. Proceeding in this way we can find an edge-bounding table with 0 appearing at least once in every row, and hence we have $\#E' \le \bigl\lfloor \frac{-n + \sum_{i=1}^n (\ell(w_i)-1)}{2} \bigr\rfloor$, which is exactly what we need to prove (ii). □

4.11. Enumeration of CLT sentences by CLT word-pairs. Fix an even positive integer $n$ and for $i = 1,\dots,n/2$ choose a cross-section $P_i$ of the set of CLT word-pairs so as to achieve the following condition:

• For all distinct $i$ and $j$, every word-pair belonging to $P_i$ is disjoint from every word-pair belonging to $P_j$.

We declare a permutation $\sigma$ of $\{1,\dots,n\}$ to be a perfect matching if it satisfies the following conditions:

• $\sigma(2i-1) < \sigma(2i)$ for $i = 1,\dots,n/2$.

• $\sigma(2i-1) < \sigma(2i+1)$ for $i = 1,\dots,n/2-1$.

Now let $\varphi$ be any real-valued function of $n$-word-long sentences $a = [w_i]_{i=1}^n$ such that $\varphi(a)$ depends only on the equivalence class of $a$ and vanishes for $\sum_{i=1}^n \ell(w_i) \gg 0$, in which case the support of $\varphi$ consists of only finitely many equivalence classes of sentences. By Proposition 4.9 we have an enumeration formula

\[ \sum_a \varphi(a) = \sum_{[p_1,p_2] \in P_1} \cdots \sum_{[p_{n-1},p_n] \in P_{n/2}}\ \sum_{\substack{\sigma \in S_n \\ \sigma:\ \text{perfect matching}}} \varphi\bigl([p_{\sigma^{-1}(i)}]_{i=1}^n\bigr) \tag{13} \]

where $a$ ranges over any cross-section of the set of $n$-word-long CLT sentences.
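
The permutations called perfect matchings in §4.11 can be listed directly for small $n$. The sketch below is illustrative (function names are hypothetical): it filters all permutations of $\{1,\dots,n\}$ by the two defining inequalities and compares the count with $(n-1)!! = (n-1)(n-3)\cdots 3\cdot 1$, the standard number of ways to split $n$ items into unordered pairs.

```python
from itertools import permutations


def is_perfect_matching(sigma):
    """sigma is the tuple (sigma(1), ..., sigma(n)) with n even."""
    n = len(sigma)
    s = (None,) + tuple(sigma)  # shift so that s[i] = sigma(i)
    pairs_ordered = all(s[2 * i - 1] < s[2 * i] for i in range(1, n // 2 + 1))
    firsts_ordered = all(s[2 * i - 1] < s[2 * i + 1] for i in range(1, n // 2))
    return pairs_ordered and firsts_ordered


def perfect_matchings(n):
    return [p for p in permutations(range(1, n + 1)) if is_perfect_matching(p)]


def odd_double_factorial(n):
    """(n - 1)!! for even n."""
    result = 1
    for k in range(n - 1, 0, -2):
        result *= k
    return result


print(perfect_matchings(4))
print(len(perfect_matchings(6)), odd_double_factorial(6))
```

Each such $\sigma$ places the words $p_{2i-1}, p_{2i}$ of the $i$-th pair at positions $\sigma(2i-1) < \sigma(2i)$, and the second condition orders the pairs by their first positions, so every pairing of the $n$ positions is counted exactly once.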
Proposition 4.12. Let $a = [w,x]$ be a CLT word-pair and put

\[ G = (V,E) = G_a = (V_a,E_a). \]

For each $e \in E$ let $\nu(e,w)$ (resp., $\nu(e,x)$) denote the number of visits to $e$ by the word/walk $w$ (resp., $x$). The following hold:

(1) $G$ is connected.

(2) Either $\#V - 1 = \#E$ or $\#V = \#E$.

(3) If $\#V - 1 = \#E$, then:

(a) $G$ is a tree.

(b) For all $e \in E$ both $\nu(e,w)$ and $\nu(e,x)$ are even.

(c) For unique $e_0 \in E$ we have $\nu(e_0,w) = \nu(e_0,x) = 2$.

(d) For all $e \in E \setminus \{e_0\}$ we have $\nu(e,w) + \nu(e,x) = 2$.

(e) Both $w$ and $x$ are Wigner words.

(4) If $\#V = \#E$, then:

(a) $G$ is not a tree.

(b) For all $e \in E$ we have $\nu(e,w) + \nu(e,x) = 2$.

(c) For some $e \in E$ we have $\nu(e,w) = \nu(e,x) = 1$.

We state these facts for the sake of convenient reference. We omit the easy proofs. 



5. Limit calculations 

We work out limits of and estimates for moments needed as "raw material" for the proofs of Theorems 3.2 and 3.3. Assumption 3.1 remains in force throughout these calculations.

5.1. Random variables indexed by sentences. Fix a sentence

\[ a = [w_i]_{i=1}^n = \big[[\alpha_{ij}]_{j=1}^{\ell(w_i)}\big]_{i=1}^n \]

consisting of $n$ words. We attach several random variables to $a$, as follows.
5.1.1. We define

\[ \xi(a) = \prod_{i=1}^n \prod_{j=1}^{\ell(w_i)-1} \xi_{\{\alpha_{ij},\alpha_{i,j+1}\}}. \]

From the independence of the family $\{\xi_e\}$ and assumption (2) concerning the
absolute moments of these random variables, we deduce that

\[ E|\xi(a)| = \prod_{e:\ \text{edge of } G_a} E|\xi_e|^{\nu(e)} \le \prod_{e=\{\alpha,\beta\}:\ \text{edge of } G_a} \begin{cases} s^{(\nu(e))}(\kappa_0(\alpha),\kappa_0(\beta)) & \text{if } \alpha \ne \beta,\\ d^{(\nu(e))}(\kappa_0(\alpha)) & \text{if } \alpha = \beta, \end{cases} \tag{14} \]

where

\[ \nu(e) = \text{total number of visits made to } e \text{ by all the words/walks } w_i. \tag{15} \]

(While $\nu(e)$ depends on the sentence $a$, to avoid unnecessary clutter we omit this
dependence from the notation.) Further and crucially, since all the random variables
of the family $\{\xi_e\}$ are of mean zero, if $w$ is a closed word, then $E\xi(w) = 0$ unless
$w$ is a weak Wigner word.

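5.1.2. We define

\[ \bar\xi(a) = \prod_{i=1}^n \big(\xi(w_i) - E\xi(w_i)\big). \]

Expanding the product on the right in evident fashion we find that

\[ \bar\xi(a) = \sum_{I \subset \{1,\dots,n\}} (-1)^{\#I} \prod_{j \in \{1,\dots,n\}\setminus I} \xi(w_j) \cdot \prod_{i \in I} E\xi(w_i). \tag{16} \]

Clearly $E|\bar\xi(a)|$ is bounded by a constant depending only on $\sum_{i=1}^n \ell(w_i)$. Further
and crucially, if all the words $w_i$ are closed, then we have $E\bar\xi(a) = 0$ unless $a$ is a
weak CLT sentence.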

5.1.3. Auxiliary color-valued random variables. We fix a letter-indexed i.i.d. family
$\{\kappa(\alpha)\}$ of color-valued random variables with common distribution $\theta$. These
random variables are going to be used only for bookkeeping purposes. They need not
be defined on the same probability space as the random variables $\xi_e$.

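5.1.4. Put

\[ M(a) = \prod_{e=\{\alpha,\beta\}:\ \text{edge of } G_a} \begin{cases} 0 & \text{if } \nu(e) = 1,\\ s^{(\nu(e))}(\kappa(\alpha),\kappa(\beta)) & \text{if } \nu(e) > 1 \text{ and } \alpha \ne \beta,\\ d^{(\nu(e))}(\kappa(\alpha)) & \text{if } \nu(e) > 1 \text{ and } \alpha = \beta, \end{cases} \]

where $\nu(e)$ is as in (15).

5.1.5. Put

\[ \bar M(a) = \sum_{I \subset \{1,\dots,n\}} (-1)^{\#I} M(a/I) \cdot \prod_{i \in I} E M(w_i), \]

where for $I \subset \{1,\dots,n\}$ we denote by $a/I$ the sentence obtained by striking the $i$-th
word of $a$ for all $i \in I$, and for $I = \{1,\dots,n\}$ we agree to put $M(a/I) = 1$. Note
the analogy with expansion (16). Note also that $M(a)$, $\bar M(a)$ are random variables.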

5.1.6. For each $n$-tuple $p = [p_i]_{i=1}^n$ of nonnegative integers put

\[ H_p(a) = \sum_\pi \prod_{i=1}^n \prod_{j=1}^{\ell(w_i)} D(\kappa(\alpha_{ij}))^{\pi_{ij}} \]

where $\pi = \big[[\pi_{ij}]_{j=1}^{\ell(w_i)}\big]_{i=1}^n$ ranges over families of nonnegative integers subject to the
constraints that $\sum_{j=1}^{\ell(w_i)} \pi_{ij} = p_i$ for $i = 1,\dots,n$. Note that $H_p(a) = \prod_{i=1}^n H_{p_i}(w_i)$.
It is convenient to set $H_p(a) = 0$ for every $n$-tuple $p$ of integers such that $p_i < 0$
for some $i$. We write

\[ MH_p(a) = M(a)H_p(a), \qquad \bar M H_p(a) = \bar M(a)H_p(a) \]

in order to abbreviate notation.

5.1.7. Note that $MH_p(a)$ (resp., $\bar M H_p(a)$) remains unchanged if for some
permutation $\sigma$ of $\{1,\dots,n\}$ we replace $a$ by the sentence $[w_{\sigma(i)}]_{i=1}^n$ and $p$ by the $n$-tuple
$[p_{\sigma(i)}]_{i=1}^n$. Note further that if the sentence $a$ can be presented as the concatenation
of pairwise disjoint sentences $b_1,\dots,b_k$ where $b_i$ is $n_i$ words long, and
correspondingly we present the $n$-tuple $p$ as the concatenation of tuples $q_1,\dots,q_k$ where $q_i$ is
an $n_i$-tuple, then $MH_p(a) = \prod_{i=1}^k MH_{q_i}(b_i)$ (resp., $\bar M H_p(a) = \prod_{i=1}^k \bar M H_{q_i}(b_i)$),
and moreover the factors on the right are independent.

5.2. Admissibility. Let $a = [w_i]_{i=1}^n$ be a sentence consisting of $n$ words. For each
edge $e$ of the graph $G_a$, let $\nu(e)$ be the total number of visits to $e$ by the words/walks
$w_i$. We say that $a$ is weakly admissible if for all edges $e$ of $G_a$ the following hold:

• $\nu(e) \in \{1,2,4\}$.

• If $\nu(e) = 4$, then $e$ is nondegenerate.

We say that $a$ is admissible if for every nonempty subset $\{i_1 < \cdots < i_k\} \subset \{1,\dots,n\}$
the subsentence $[w_{i_j}]_{j=1}^k$ is weakly admissible. For words weak admissibility and
admissibility are the same thing. By Proposition 4.5 every Wigner word is admissible.
By Proposition 4.8 every critical weak Wigner word is admissible. By
Propositions 4.9 and 4.12 every CLT sentence is admissible.
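The two admissibility notions can be checked mechanically on small examples. The following is a minimal Python sketch, assuming words are given as tuples of letters and that a degenerate edge $\{\alpha,\alpha\}$ is encoded as a one-element set; the helper names are hypothetical and do not come from the paper.

```python
from itertools import combinations

def edge_visits(sentence):
    """Count the visits nu(e) made jointly by the words of a sentence to each edge."""
    visits = {}
    for word in sentence:
        for a, b in zip(word, word[1:]):
            e = frozenset((a, b))  # a one-element set encodes a degenerate edge
            visits[e] = visits.get(e, 0) + 1
    return visits

def is_weakly_admissible(sentence):
    """nu(e) must lie in {1, 2, 4}, and nu(e) = 4 only for nondegenerate e."""
    for e, nu in edge_visits(sentence).items():
        if nu not in (1, 2, 4):
            return False
        if nu == 4 and len(e) == 1:
            return False
    return True

def is_admissible(sentence):
    """Every nonempty subsentence must be weakly admissible."""
    n = len(sentence)
    return all(
        is_weakly_admissible([sentence[i] for i in idx])
        for k in range(1, n + 1)
        for idx in combinations(range(n), k)
    )

# Jointly the edge {1, 2} is visited 4 times, but the first word alone visits it 3 times.
print(is_weakly_admissible([(1, 2, 1, 2), (1, 2)]), is_admissible([(1, 2, 1, 2), (1, 2)]))
```

The last line illustrates why the subsentence condition is needed: a sentence can be weakly admissible without being admissible.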

Proposition 5.3. Let $a = [w_i]_{i=1}^n$ be a weakly admissible sentence consisting of
$n$ words. Let $p = [p_i]_{i=1}^n$ be an $n$-tuple of nonnegative integers. Let $\gamma_1,\dots,\gamma_r$ be
distinct letters such that $\operatorname{supp} a \subset \{\gamma_1,\dots,\gamma_r\}$. Then there exists a function $f$ on
the product of $r$ copies of color space with the following properties:

• $f$ is bounded and measurable.

• $f$ has discontinuity set of measure zero with respect to $\theta^{\otimes r}$.

• $f(\kappa(\gamma_1),\dots,\kappa(\gamma_r)) = MH_p(a)$.

• For all distinct letters $\delta_1,\dots,\delta_r$, the equivalent sentence $b = \big[[\beta_{ij}]_{j=1}^{\ell(w_i)}\big]_{i=1}^n$ to
which $a$ codes by the rule $\gamma_i \mapsto \delta_i$ for $i = 1,\dots,r$ satisfies the equation

\[ f(\kappa_0(\delta_1),\dots,\kappa_0(\delta_r)) = E\xi(b) \sum_\pi \prod_{i=1}^n \prod_{j=1}^{\ell(w_i)} D(\kappa_0(\beta_{ij}))^{\pi_{ij}} \]

where $\pi = \big[[\pi_{ij}]_{j=1}^{\ell(w_i)}\big]_{i=1}^n$ ranges over families of nonnegative integers subject
to the constraints $\sum_{j=1}^{\ell(w_i)} \pi_{ij} = p_i$ for $i = 1,\dots,n$.

Proof. For simplicity we discuss only the case $p = 0$ and leave the remaining details
to the reader. Put $G = (V,E) = G_a = (V_a,E_a)$ and as above, for all $e \in E$, let $\nu(e)$
be the total number of visits to $e$ made by the words/walks $w_i$. Put

\[ f(c_1,\dots,c_r) = \prod_{e=\{\alpha,\beta\}\in E} \begin{cases} 0 & \text{if } \nu(e) = 1,\\ s^{(\nu(e))}(c_{\gamma^{-1}(\alpha)}, c_{\gamma^{-1}(\beta)}) & \text{if } \nu(e) > 1 \text{ and } \alpha \ne \beta,\\ d^{(\nu(e))}(c_{\gamma^{-1}(\alpha)}) & \text{if } \nu(e) > 1 \text{ and } \alpha = \beta, \end{cases} \]

where $\gamma^{-1}$ is the inverse of the bijection $(i \mapsto \gamma_i) : \{1,\dots,r\} \to \{\gamma_1,\dots,\gamma_r\}$.
Clearly $f$ has the first three of the desired properties. If $\nu(e) = 1$ for some $e \in E$,
then the fourth property holds trivially (both sides of the desired equation vanish
identically). Otherwise, if $\nu(e) > 1$ for all $e \in E$, then $f$ has the fourth property
because, under the hypothesis of weak admissibility, we are operating in the regime
in which we enforce equality in the moment bounds. □

5.4. Limiting behavior of $\langle \bar L(\mathcal N), x^n \rangle$. Fix a nonempty finite set $\mathcal N$ of letters
and a positive integer $n$.

5.4.1. We have an expansion

\[ \operatorname{tr} X(\mathcal N)^n = \sum_w \sum_J \sum_x N^{(\#J - n)/2}\, \xi(x/J) \prod_{j \in J} D(\kappa_0(\beta_j)) \]

where:

• $w = [\alpha_i]_{i=1}^{n+1}$ ranges over a cross-section of the set of closed words of length
$n + 1$;

• $J$ ranges over subsets of the set $\{j \in \{1,\dots,n\} \mid \alpha_j = \alpha_{j+1}\}$;

• $x = [\beta_i]_{i=1}^{n+1}$ ranges over words such that $x \sim w$ and $\operatorname{supp} x \subset \mathcal N$; and

• $x/J$ denotes the word obtained by striking the $j$-th letter of $x$ for each $j \in J$.

Note that $x/J$ arises from $x$ by selective suppression of repeated letters. Note also
that $E\xi(x/J) = 0$ unless $x/J$ is a weak Wigner word. By considering how we may
insert repetitions of letters into a given weak Wigner word, and after some further
algebraic manipulation, we obtain an expansion

\[ \langle \bar L(\mathcal N), x^n \rangle = \sum_w N^{\operatorname{wt} w - (\ell(w)+1)/2} \sum_x \sum_\pi N^{-\operatorname{wt} w}\, E\xi(x) \prod_{i=1}^{\ell(w)} D(\kappa_0(\beta_i))^{\pi_i} =: \sum_w N^{\operatorname{wt} w - (\ell(w)+1)/2}\, S(\mathcal N, w) \tag{17} \]

where:

• $w$ ranges over a cross-section of the set of weak Wigner words of length
$\le n + 1$;

• $x = [\beta_i]_{i=1}^{\ell(w)}$ ranges over words such that $x \sim w$ and $\operatorname{supp} x \subset \mathcal N$;

• $\pi = [\pi_i]_{i=1}^{\ell(w)}$ ranges over $\ell(w)$-tuples of nonnegative integers summing to
$n + 1 - \ell(w)$; and

• $S(\mathcal N, w)$ is the result of carrying out the inner summations on $x$ and $\pi$.

Note that for $n$ fixed, as $N \to \infty$ and $\theta_{\mathcal N} \to \theta$, only the part of the sum indexed
by Wigner words $w$ contributes nonnegligibly.
5.4.2. In this paragraph fix attention on a Wigner word $w$ such that $\ell(w) \le n + 1$.
We want to understand the subsum $S(\mathcal N, w)$ appearing in formula (17) as a
function of $\mathcal N$. Let $\gamma_1,\dots,\gamma_r$ be an enumeration of $\operatorname{supp} w$. Since Wigner words are
admissible, Proposition 5.3 provides us with a function $f$ defined on the product of
$r$ copies of color space with the following properties:

• $f$ is bounded and measurable.

• $f$ has discontinuity set of measure zero with respect to $\theta^{\otimes r}$.

• $MH_{n+1-\ell(w)}(w) = f(\kappa(\gamma_1),\dots,\kappa(\gamma_r))$.

• $S(\mathcal N, w) = N^{-r} \sum f(\kappa_0(\beta_1),\dots,\kappa_0(\beta_r))$, the sum being extended over all
$(\beta_1,\dots,\beta_r) \in \mathcal N^r$ with $\beta_1,\dots,\beta_r$ distinct.

Now let $[\mathcal N_k]_{k=1}^\infty$ be as in Assumption 3.1. We clearly have

\[ \lim_{k\to\infty} S(\mathcal N_k, w) = \int \cdots \int f(c_1,\dots,c_r)\, \theta(dc_1) \cdots \theta(dc_r) = E\, MH_{n+1-\ell(w)}(w). \]

We remark that it is here we make use of the hypothesis that color space is Polish:
we need it to guarantee weak convergence $\theta_{\mathcal N_k}^{\otimes r} \to \theta^{\otimes r}$.

5.4.3. We may now conclude that

\[ \lim_{k\to\infty} \langle \bar L(\mathcal N_k), x^n \rangle = \sum_w E\, MH_{n+1-\ell(w)}(w) \tag{18} \]

where the sum on the right is extended over a cross-section of the set of Wigner
words. Note that only finitely many terms on the right are nonvanishing because
$p < 0 \Rightarrow H_p = 0$.

Lemma 5.5. With C as in L{Afk) converges weakly to a limit /i supported in 
the interval [— C, C], and moreover (L(A/'fc),x") {fJ'jX"') for all integers n > 0. 

Proof. It is enough to prove that the right side of (18) is $O(C^n)$. There are $\binom{p+n-1}{n-1}$
$n$-tuples of nonnegative integers summing to $p$. Consequently we have

for all Wigner words $w$ and nonnegative integers $p$. There are $\frac{1}{\ell+1}\binom{2\ell}{\ell}$ equivalence
classes of Wigner words of length $2\ell + 1$ and clearly there are no Wigner words of
even length. Consequently there are $O(2^n)$ equivalence classes of Wigner words of
length $\le n + 1$. The desired $O(C^n)$ bound for the right side of (18) follows. □
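The Catalan count of Wigner words used in this proof can be confirmed by brute force for short words. The sketch below is illustrative only and rests on an assumption about the definition from Section 4, which is not restated here: a Wigner word is taken to be a closed word whose graph is a tree and which visits every edge exactly twice. Equivalence classes are represented by words on the alphabet $1, 2, 3, \dots$ in which each new letter is the smallest unused one; all function names are hypothetical.

```python
from math import comb

def canonical_words(length):
    """Yield one representative per equivalence class of words of the given length."""
    def extend(prefix, largest):
        if len(prefix) == length:
            yield tuple(prefix)
            return
        for letter in range(1, largest + 2):
            yield from extend(prefix + [letter], max(largest, letter))
    yield from extend([1], 1)

def edge_visits(word):
    """Count visits of the walk to each edge {a, b} between consecutive letters."""
    visits = {}
    for a, b in zip(word, word[1:]):
        e = frozenset((a, b))
        visits[e] = visits.get(e, 0) + 1
    return visits

def is_wigner_word(word):
    """Assumed characterization: closed, graph is a tree, every edge visited twice."""
    visits = edge_visits(word)
    closed = word[0] == word[-1]
    no_loops = all(len(e) == 2 for e in visits)
    # The graph of a walk is connected, so it is a tree iff #E = #V - 1 and no loops.
    is_tree = no_loops and len(visits) == len(set(word)) - 1
    return closed and is_tree and all(v == 2 for v in visits.values())

def count_wigner_classes(length):
    return sum(is_wigner_word(w) for w in canonical_words(length))

for l in range(5):
    print(2 * l + 1, count_wigner_classes(2 * l + 1), comb(2 * l, l) // (l + 1))
```

The enumeration grows like the Bell numbers, so it is meant as a sanity check for lengths up to about 9, not as a counting method.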

5.6. Limiting behavior of $E \prod_{i=1}^n (\operatorname{tr} X(\mathcal N)^{\nu_i} - E \operatorname{tr} X(\mathcal N)^{\nu_i})$. Again fix a finite
nonempty set $\mathcal N$ of letters and a positive integer $n$. Also fix positive integers
$\nu_1,\dots,\nu_n$ and put $\nu = [\nu_i]_{i=1}^n$.

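5.6.1. We have an expansion

\[ \prod_{i=1}^n \big(\operatorname{tr} X(\mathcal N)^{\nu_i} - E \operatorname{tr} X(\mathcal N)^{\nu_i}\big) = \sum_a \sum_K \sum_b N^{\sum_{i=1}^n (\#K_i - \nu_i)/2}\, \bar\xi\big([x_i/K_i]_{i=1}^n\big) \prod_{i=1}^n \prod_{j \in K_i} D(\kappa_0(\beta_{ij})) \]

where:

• $a = [w_i]_{i=1}^n = \big[[\alpha_{ij}]_{j=1}^{\nu_i+1}\big]_{i=1}^n$ ranges over a cross-section of the set of
sentences $n$ words long with $i$-th word of length $\nu_i + 1$ for $i = 1,\dots,n$;

• $K = [K_i]_{i=1}^n$ ranges over $n$-tuples of sets of positive integers such that $K_i$
is a subset of $\{j \in \{1,\dots,\nu_i\} \mid \alpha_{ij} = \alpha_{i,j+1}\}$ for $i = 1,\dots,n$;

• $b = [x_i]_{i=1}^n = \big[[\beta_{ij}]_{j=1}^{\nu_i+1}\big]_{i=1}^n$ ranges over sentences $b \sim a$ such that $\operatorname{supp} b \subset \mathcal N$; and

• $x_i/K_i$ denotes the word obtained by striking the $k$-th letter of $x_i$ for all
$k \in K_i$.

After some further algebraic manipulation, we obtain an expansion

\[ E \prod_{i=1}^n \big(\operatorname{tr} X(\mathcal N)^{\nu_i} - E \operatorname{tr} X(\mathcal N)^{\nu_i}\big) = \sum_a N^{\operatorname{wt} a - \sum_{i=1}^n (\ell(w_i)-1)/2} \sum_b \sum_\pi N^{-\operatorname{wt} a}\, E\bar\xi(b) \prod_{i=1}^n \prod_{j=1}^{\ell(w_i)} D(\kappa_0(\beta_{ij}))^{\pi_{ij}} \tag{19} \]

where:

• $a = [w_i]_{i=1}^n$ ranges over a cross-section of the set of weak CLT sentences $n$
words long with $i$-th word of length $\le \nu_i + 1$ for $i = 1,\dots,n$;

• $b = \big[[\beta_{ij}]_{j=1}^{\ell(w_i)}\big]_{i=1}^n$ ranges over sentences $b \sim a$ such that $\operatorname{supp} b \subset \mathcal N$; and

• $\pi = \big[[\pi_{ij}]_{j=1}^{\ell(w_i)}\big]_{i=1}^n$ ranges over families of nonnegative integers subject to
the constraints $\sum_{j=1}^{\ell(w_i)} \pi_{ij} = \nu_i + 1 - \ell(w_i)$ for $i = 1,\dots,n$.

Note that for fixed $\nu$, as $N \to \infty$ and $\theta_{\mathcal N} \to \theta$, only the part of the sum indexed by
CLT sentences $a$ contributes nonnegligibly.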

5.6.2. Now let $[\mathcal N_k]_{k=1}^\infty$ be as in Assumption 3.1. Since CLT sentences are admissible,
an analysis similar to that undertaken in §5.4.2 leads to the conclusion that

\[ \lim_{k\to\infty} E \prod_{i=1}^n \big(\operatorname{tr} X(\mathcal N_k)^{\nu_i} - E \operatorname{tr} X(\mathcal N_k)^{\nu_i}\big) = \sum_a E\, \bar M H_{[\nu_i+1-\ell(w_i)]_{i=1}^n}(a) \tag{20} \]

where $a = [w_i]_{i=1}^n$ ranges over a cross-section of the set of CLT sentences $n$ words
long. Since the analysis is straightforward, somewhat long, and very tedious, we
omit it. Note that only finitely many nonzero terms appear in the sum on the right.

Lemma 5.7. There exists a family $[Y_n]_{n=1}^\infty$ of mean zero random variables defined
on a common probability space with Gaussian joint distribution such that for all
positive integers $n$ and positive integers $\nu_1,\dots,\nu_n$ the right side of limit formula
(20) gives the expectation $E \prod_{i=1}^n Y_{\nu_i}$.

Proof. Let $A(\nu_1,\dots,\nu_n)$ denote the right side of (20). The matrix $\big[[A(i,j)]_{i=1}^\infty\big]_{j=1}^\infty$ is
symmetric and every finite block $\big[[A(i,j)]_{i=1}^n\big]_{j=1}^n$ in the upper left corner is positive
semidefinite since it is the limit of such matrices. Consequently there exists a
family $[Y_n]_{n=1}^\infty$ of mean zero random variables on a common probability space with
Gaussian joint distribution such that $E Y_i Y_j = A(i,j)$ for all $i$ and $j$. By the
enumeration formula (13) and the relations discussed in §5.1.7 we have

\[ A(\nu_1,\dots,\nu_n) = \begin{cases} \displaystyle\sum_{\sigma:\ \text{perfect matching}} \ \prod_{i=1}^{n/2} A(\nu_{\sigma(2i-1)}, \nu_{\sigma(2i)}) & \text{if } n \text{ is even},\\ 0 & \text{if } n \text{ is odd}. \end{cases} \]


But the expression on the right side is the Wick formula for the expectation 
E nr=i Y<^. , cf. Theorem L28]. □ 
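The Wick formula invoked here is easy to evaluate directly for small $n$. The following Python sketch is not taken from the paper: `pairings`, `wick_moment`, and the covariance functions passed in are hypothetical stand-ins (the covariance plays the role of $A(i,j)$, whose actual values are not computed here).

```python
def pairings(indices):
    """Yield every way of splitting a list of indices into unordered pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def wick_moment(cov, nus):
    """E[Y_{nu_1} ... Y_{nu_n}] for a centered jointly Gaussian family with
    E[Y_i Y_j] = cov(i, j), computed as a sum over pairings of positions."""
    if len(nus) % 2 == 1:
        return 0.0
    total = 0.0
    for pairing in pairings(list(range(len(nus)))):
        term = 1.0
        for i, j in pairing:
            term *= cov(nus[i], nus[j])
        total += term
    return total

# Illustrative covariance only: cov(i, j) = min(i, j), as for Brownian motion at integer times.
print(wick_moment(lambda i, j: float(min(i, j)), [1, 2, 3, 4]))
```

With the constant covariance 1 the function returns the moments of a single standard Gaussian variable, which gives a quick consistency check.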
6. Proofs of Lemma 2.3 and Theorem 3.2

Lemma 6.1. Fix $K > \max(1, C^2)$ with $C$ as in Lemma 5.5. Then we have

\[ \lim_{k\to\infty} \langle \bar L(\mathcal N_k), |g| \mathbf 1_{|x|>K} \rangle = 0 \]

for every real-valued measurable function $g$ on the real line with polynomial growth
at infinity.

Proof. Let $n$ be any nonnegative integer. By Cauchy-Schwarz followed by Chebyshev
we have

\[ \limsup_{k\to\infty} \langle \bar L(\mathcal N_k), |x|^n \mathbf 1_{|x|>K} \rangle \le \limsup_{k\to\infty} \langle \bar L(\mathcal N_k), x^{2n} \rangle / K^n. \tag{21} \]

Because $K > 1$, the quantity on the left side of (21) is an increasing function of
$n$, and moreover that quantity bounds $\limsup_{k\to\infty} \langle \bar L(\mathcal N_k), |g| \mathbf 1_{|x|>K} \rangle$ for all $n \gg 0$
(because $g$ is of polynomial growth). But by Lemma 5.5, because $K > C^2$, the
quantity on the right side of (21) tends to $0$ as $n \to \infty$. The result follows. □

6.2. The functions $\Phi^{(w,p)}(c)$. To each Wigner word $w$ and nonnegative integer
$p$ we associate a real-valued bounded measurable function $\Phi^{(w,p)}(c)$ on color space
by the following recursive procedure. As in Proposition 4.5, in the unique way
possible, write $w = \alpha w_1 \cdots \alpha w_r \alpha$ where $\alpha$ is the first letter of $w$ and the $w_i$ are
pairwise disjoint Wigner words in which $\alpha$ does not appear, and then put

\[ \Phi^{(w,p)}(c) = \sum_\pi D(c)^{\pi_0 + \cdots + \pi_r} \prod_{i=1}^r \int s^{(2)}(c,c')\, \Phi^{(w_i,\pi_{r+i})}(c')\, \theta(dc') \tag{22} \]

where $\pi = [\pi_i]_{i=0}^{2r}$ ranges over $(2r+1)$-tuples of nonnegative integers summing
to $p$. By convention, if $w$ is the single letter word $\alpha$, then $r = 0$ and therefore
$\Phi^{(\alpha,p)}(c) = D(c)^p$, which gives a way to initialize the recursions (22). Note that for
fixed $p$ and $c$ the quantity $\Phi^{(w,p)}(c)$ depends only on the equivalence class of $w$.
Intuitively, $\Phi^{(w,p)}(c)$ determines the dominant contribution to the expectation of
$\operatorname{tr} X(\mathcal N)^{\ell(w)+p}$ by those terms that use entries from $D$ $p$ times, such that when
these are discarded, the resulting word determined by the indices is equivalent to
$w$, and such that the color of the initial letter is $c$. For example, in the special case
that $D(\cdot) = 0$, one must have $p = 0$, hence all $\pi_i$ vanish, and the contribution, for
a given $w$, can be visualized by writing on each edge $(v_1,v_2)$ of the rooted planar
tree the value $(s^{(2)})^{1/2}(\kappa(v_1),\kappa(v_2))$, collecting the product of such values along
the exploration path determined by the word $w$, and averaging over the choices of
colors except for the choice of the color of the root, which is fixed at $c$.

Lemma 6.3. We have the following identity of formal power series in $t$ with
coefficients in the space of real-valued bounded measurable functions on color space:

\[ \Phi(c,t) = \sum_w \sum_{p=0}^\infty \Phi^{(w,p)}(c)\, t^{\ell(w)+p}. \tag{23} \]

Here $w$ ranges over a cross-section of the set of Wigner words.

Proof. Via the enumeration formula (12) it follows from definition (22) that the
power series on the right side of (23) satisfies the equation that characterizes $\Phi(c,t)$,
whence the result. □

Lemma 6.4. Let $w$ be a Wigner word. Let $\alpha$ be the first letter of $w$. Let $p$ be a
nonnegative integer. Then we have

\[ E(MH_p(w) \mid \kappa(\alpha)) = \Phi^{(w,p)}(\kappa(\alpha)), \quad \text{a.s.} \tag{24} \]

Proof. As in definition (22), write $w = \alpha w_1 \cdots \alpha w_r \alpha$ where the $w_i$ are pairwise
disjoint Wigner words in which $\alpha$ does not occur and let $\alpha_i$ denote the first letter
of $w_i$. By definition of $M(\cdot)$ and $H_p(\cdot)$ we have

\[ MH_p(w) = \sum_\pi D(\kappa(\alpha))^{\pi_0 + \cdots + \pi_r} \prod_{i=1}^r s^{(2)}(\kappa(\alpha),\kappa(\alpha_i))\, MH_{\pi_{r+i}}(w_i) \tag{25} \]

where $\pi = [\pi_i]_{i=0}^{2r}$ ranges over $(2r+1)$-tuples of nonnegative integers summing to
$p$. Now take conditional expectations on both sides of (25). By induction on $\ell(w)$,
and the relations of independence built into the definitions of $M(\cdot)$ and $H_\cdot(\cdot)$, we
get (24) after a routine calculation. □

6.5. Ends of the proofs. 

6.5. f. Proof of Lemma \2.!A Uniqueness of a probability measure with moments © 
and support (O (which is compact) is clear. Only existence requires proof. After 
enlarging the originally given model in evident fashion we may assume without loss 
of generality that for every letter there exist infinitely many letters of the same 
color. And then we may assume without loss of generality that a = 9 because 
in Assumption 13. ll we may substitute for 6 without falsifying it. Now fix any 
sequence [A4]fe^i as in Assumption 13. II Let ^ be the weak limit of L(A4) provided 
by Lemma 15.51 By the cited lemma, satisfies the support bound 0. Moreover, 
by the cited lemma combined with limit formula (18), the measure $\mu$ has moments

\[ \langle \mu, x^n \rangle = \sum_w E\, MH_{n+1-\ell(w)}(w) \tag{26} \]

where $w$ ranges over a cross-section of the set of Wigner words. By Lemmas 6.3
and 6.4 we can evaluate the right side of (26). We find finally that the moment formula
asserted in Lemma 2.3 does indeed hold for $\mu$. □

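6.5.2. Proof of Theorem 3.2. Fix any real-valued bounded continuous function $f$
on the real line and $\epsilon > 0$. For the convergence $L(\mathcal N_k) \to \mu$ it is enough to show
that

\[ \lim_{k\to\infty} P\big(|\langle L(\mathcal N_k), f \rangle - \langle \mu, f \rangle| > \epsilon\big) = 0. \tag{27} \]

Fix $K$ as in Lemma 6.1. By the Weierstrass approximation theorem write

\[ f = g + Q, \qquad \sup_{|x| \le K} |g(x)| < \epsilon/4 \]

where $Q$ is a polynomial function. We have

\[ \langle L(\mathcal N_k), f \rangle - \langle \mu, f \rangle = \big[\langle L(\mathcal N_k), \mathbf 1_{|x| \le K}\, g \rangle - \langle \mu, \mathbf 1_{|x| \le K}\, g \rangle\big] + \langle L(\mathcal N_k), \mathbf 1_{|x| > K}\, g \rangle + \langle \bar L(\mathcal N_k), Q \rangle - \langle \mu, Q \rangle + \langle L(\mathcal N_k), Q \rangle - \langle \bar L(\mathcal N_k), Q \rangle \]

and therefore have

\[ P\big(|\langle L(\mathcal N_k), f \rangle - \langle \mu, f \rangle| > \epsilon\big) \le P\big(\langle L(\mathcal N_k), \mathbf 1_{|x| > K} |g| \rangle > \epsilon/6\big) + P\big(|\langle \bar L(\mathcal N_k), Q \rangle - \langle \mu, Q \rangle| > \epsilon/6\big) + P\big(|\langle L(\mathcal N_k), Q \rangle - \langle \bar L(\mathcal N_k), Q \rangle| > \epsilon/6\big) =: P_1 + P_2 + P_3. \]

We have $P_1 \to 0$ by Lemma 6.1. We have $P_2 \to 0$ by Lemma 5.5. We have $P_3 \to 0$
by limit formula (20). Therefore (27) does indeed hold.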

We finally turn to proving the convergence $\mu_{\mathcal N_k} \to \mu$. The proof of Lemma 2.3
shows that the analogue

\[ \langle \mu_{\mathcal N}, x^n \rangle = \sum_w E\, M_{\mathcal N} H_{n+1-\ell(w),\mathcal N}(w) \tag{28} \]

of formula (26) holds for any nonempty finite set of letters $\mathcal N$, where the random
variables $M_{\mathcal N}(w)$ and $H_{p,\mathcal N}(w)$ are defined by mimicking the definitions of $M(w)$
and $H_p(w)$, only this time using a letter-indexed family $\{\kappa_{\mathcal N}(\alpha)\}$ of color-valued
i.i.d. random variables with common law $\theta_{\mathcal N}$. Note that $M_{\mathcal N}(w)$ and $H_{p,\mathcal N}(w)$
are uniformly bounded in $\mathcal N$. Clearly for each Wigner word $w$ and nonnegative
integer $p$ we have convergence in distribution $M_{\mathcal N_k} H_{p,\mathcal N_k}(w) \to MH_p(w)$, which
extends to the convergence of expectations by bounded convergence. The sum in
(28) being over a finite number of terms, it follows that $\langle \mu_{\mathcal N_k}, x^n \rangle \to \langle \mu, x^n \rangle$ for
all $n$, and in turn that $\mu_{\mathcal N_k} \to \mu$ since the measures in play here have uniformly
bounded supports. The proof of Theorem 3.2 is complete. □

7. The Füredi-Komlós circle of ideas

In this section, we describe a (rough) technique which allows us to bound traces
of polynomials of our random matrices when the degree of the polynomial is allowed
to grow with the dimension of the matrix. The approach we take is inspired by the
work of Füredi and Komlós [FK81]. We mention in passing that for Wigner matrices
all of whose entries have even distributions, much more detailed information is
available in [SS98].

7.1. FK sentences. Let $a = [w_i]_{i=1}^n$ be a sentence of $n$ words. We say that $a$ is
an FK sentence under the following conditions:

• $G_a$ is a tree.

• Jointly the words/walks $w_i$ visit no edge of $G_a$ more than twice.

• For $i = 1,\dots,n-1$, the first letter of $w_{i+1}$ belongs to $\bigcup_{j=1}^i \operatorname{supp} w_j$.

We say that $a$ is an FK word if $n = 1$. Any word admitting interpretation as a walk
on a forest visiting no edge of the forest more than twice is automatically an FK
word. The constituent words of an FK sentence are FK words. If an FK sentence
is at least two words long, then the result of dropping the last word is again an FK
sentence. If the last word of an FK sentence is at least two letters long, then the
result of dropping the last letter of the last word is again an FK sentence.

7.2. The graph $G^1_a$ associated to a sentence. Given an $n$-word-long sentence
$a = [w_i]_{i=1}^n$, define $G^1_a = (V^1_a, E^1_a)$ to be the subgraph of $G_a = (V_a, E_a)$ with
$V^1_a = V_a$ and $E^1_a$ equal to the set of edges $e \in E_a$ such that the words/walks $w_i$
jointly visit $e$ exactly once.

Figure 2. The graphs $G_w$ (left) and $G^1_w$ (right) for the FK word
$w = 12131454$.



Proposition 7.3. Let $w$ be an FK word. There is exactly one way to write $w =
w_1 \cdots w_r$ where the words $w_i$ are pairwise disjoint Wigner words.

In this situation, denoting by $\alpha_i$ the first letter of $w_i$, we declare the word $\alpha_1 \cdots \alpha_r$
to be the acronym of the FK word $w$.

Proof. The only possible decomposition $w = w_1 \cdots w_r$ of the desired type is the
one with breaks at the edges of $G^1_w$. Since the transition from $w_{i-1}$ to $w_i$ is along
an edge of the tree $G_w$ never again visited by $w$, the words $w_i$ must be pairwise
disjoint. Since every edge of $G_w$ visited by $w_i$ is visited exactly twice by $w$, and
the $w_i$ are pairwise disjoint, in fact $w_i$ visits every edge of $G_w$ either twice or never,
hence by the parity principle $w_i$ is closed, and hence $w_i$ is a Wigner word. □

Lemma 7.4. There are at most $2^{n-1}$ equivalence classes of FK words of length $n$.

Proof. From the recursion (12) (see also §2.3.3 below) it is easy to deduce that
the sum of terms $t^{\ell(w)}$ extended over a cross-section of the set of Wigner words is

\[ \frac{1 - \sqrt{1 - 4t^2}}{2t}. \]

Via the preceding proposition, it follows that the sum of terms $t^{\ell(w)}$ extended over a
cross-section of the set of FK words is

\[ \frac{1 - \sqrt{1 - 4t^2}}{2t - 1 + \sqrt{1 - 4t^2}} = -\frac{1}{2} + \frac{1 + 2t}{2\sqrt{1 - 4t^2}}, \]

whence the claimed bound. □
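The bound of Lemma 7.4 can be sanity-checked by brute force for short words. The sketch below is illustrative only. It assumes that the Wigner-word series is the Catalan series $(1-\sqrt{1-4t^2})/(2t)$, so that the FK-word series is $-\frac12 + \frac{1+2t}{2\sqrt{1-4t^2}}$, whose coefficient of $t^n$ is $\binom{2k}{k}$ for $n = 2k+1$ and $\frac12\binom{2k}{k}$ for $n = 2k \ge 2$. All function names are hypothetical.

```python
from math import comb

def canonical_words(length):
    """Yield one representative per equivalence class of words of the given length."""
    def extend(prefix, largest):
        if len(prefix) == length:
            yield tuple(prefix)
            return
        for letter in range(1, largest + 2):
            yield from extend(prefix + [letter], max(largest, letter))
    yield from extend([1], 1)

def is_fk_word(word):
    """The graph of the word is a tree and no edge is visited more than twice."""
    visits = {}
    for a, b in zip(word, word[1:]):
        e = frozenset((a, b))
        visits[e] = visits.get(e, 0) + 1
    no_loops = all(len(e) == 2 for e in visits)
    # The graph of a walk is connected, so it is a tree iff #E = #V - 1 and no loops.
    is_tree = no_loops and len(visits) == len(set(word)) - 1
    return is_tree and all(v <= 2 for v in visits.values())

def count_fk_classes(n):
    return sum(is_fk_word(w) for w in canonical_words(n))

def fk_series_coefficient(n):
    """Coefficient of t^n (n >= 1) in -1/2 + (1 + 2t) / (2 sqrt(1 - 4 t^2))."""
    k = n // 2
    return comb(2 * k, k) if n % 2 == 1 else comb(2 * k, k) // 2

for n in range(1, 9):
    print(n, count_fk_classes(n), fk_series_coefficient(n), 2 ** (n - 1))
```

For each $n$ the three printed numbers are the brute-force count, the series coefficient, and the bound $2^{n-1}$.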

7.5. FK syllabification. Let $w = [\alpha_i]_{i=1}^n$ be a word of length $n$. Roughly speaking,
we wish to define a parsing of $w$ into an FK sentence by going sequentially over
the letters in $w$ and declaring a new word each time not doing so would prevent
the sentence formed up to that point from being an FK sentence. More precisely,
we define a sentence $w'$, which we call the FK syllabification of $w$, by the following
procedure. We declare an edge $e$ of $G_w$ to be new (relative to $w$) if for some index
$1 \le i < n$ we have $e = \{\alpha_i, \alpha_{i+1}\}$ and $\alpha_{i+1} \notin \{\alpha_1,\dots,\alpha_i\}$, and otherwise we declare
$e$ to be old. We define $w'$ to be the sentence obtained by breaking $w$ at all visits
to old edges of $G_w$ and at third and subsequent visits to new edges of $G_w$. For
example, temporarily spelling with the alphabet $\{1,2,3\}$, the FK syllabification of
$w = 1231$ is the sentence $w' = [123, 1]$ consisting of two words; the FK syllabification
process has to "insert a comma" between 3 and 1 because 1231 is not an FK
word, whereas 1, 12 and 123 are. It is clear that $G_{w'}$ is a spanning tree in $G_w$, that
$w'$ is an FK sentence, and that $w$ is the concatenation of the constituent words of
$w'$. Moreover, we have $w = w'$ if and only if $w$ is an FK word. Clearly the FK
syllabification process preserves equivalence, i.e., $w \sim x \Rightarrow w' \sim x'$.
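The syllabification procedure translates directly into a single left-to-right pass. The following Python sketch is one possible reading of the procedure, with words given as tuples of letters; the function name `fk_syllabification` is hypothetical.

```python
def fk_syllabification(word):
    """Return the FK syllabification of a word as a list of words (tuples)."""
    seen = {word[0]}        # letters encountered so far
    new_edge_visits = {}    # number of visits so far to each new edge
    sentence = [[word[0]]]
    for a, b in zip(word, word[1:]):
        e = frozenset((a, b))
        if b not in seen:
            # First arrival at the letter b: the edge {a, b} is new.
            seen.add(b)
            new_edge_visits[e] = 1
            sentence[-1].append(b)
        elif new_edge_visits.get(e, 0) == 1:
            # Second visit to a new edge: still allowed within the current word.
            new_edge_visits[e] = 2
            sentence[-1].append(b)
        else:
            # Old edge, or third or later visit to a new edge: insert a comma.
            sentence.append([b])
    return [tuple(w) for w in sentence]

print(fk_syllabification((1, 2, 3, 1)))
print(fk_syllabification((1, 2, 1, 2, 1)))
```

On the word 1231 this reproduces the sentence $[123, 1]$ from the text. On 12121 it produces three words, in agreement with the count $m = \ell(w) - 2\operatorname{wt} w + 2$ used later for weak Wigner words.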

Lemma 7.6. Let $a = [w_i]_{i=1}^n$ be a sentence of $n \ge 2$ words. Put $b = [w_i]_{i=1}^{n-1}$ and
$c = w_n$. Assume that $b$ is an FK sentence, that $c$ is an FK word, and that the
first letter of $c$ belongs to $\operatorname{supp} b$. Let $\gamma_1 \cdots \gamma_r$ be the acronym of $c$ spelled out in
full. (Note that by hypothesis $\gamma_1 \in \operatorname{supp} b$.) Let $\ell$ be the largest index such that
$\gamma_\ell \in \operatorname{supp} b$ and write $d = \gamma_1 \cdots \gamma_\ell$. The following conditions are both necessary
and sufficient for $a$ to be an FK sentence:

• $d$ is a geodesic in the forest $G^1_b$.

• $\operatorname{supp} b \cap \operatorname{supp} c = \operatorname{supp} d$.

Consequently there exist at most $(\operatorname{wt} b)^2$ equivalence classes of FK sentences
$[x_i]_{i=1}^n$ such that $b \sim [x_i]_{i=1}^{n-1}$ and $c \sim x_n$. See Figure 3 for an example of two such
equivalence classes and their pictorial description.

Figure 3. Two inequivalent FK sentences $[x_1, x_2]$ corresponding
to $b = 141252363$ (solid) and $c = 1712 - 3732$ (dashed).

Proof. Sufficiency is easy to check. We omit the details. We turn to the proof of
necessity. To begin with, since $G_a$ is a tree, $d$ is the unique geodesic in $G_c \subset G_a$
joining $\gamma_1$ to $\gamma_\ell$, and hence is also the unique geodesic in $G_b \subset G_a$ joining $\gamma_1$ to
$\gamma_\ell$. Now $d$ only visits edges of $G_b$ already visited by the constituent words of $b$.
Therefore we have $E_d \subset E^1_b$, i.e., $d$ is a walk in $G^1_b$. By Proposition 7.3 we have
$E^1_c = E_{\gamma_1 \cdots \gamma_r}$. By definition of an FK sentence we have $E_b \cap E_c \subset E^1_b \cap E^1_c$. It
follows that $E_b \cap E_c = E_d$. Finally, we have

\[ \#V_a = 1 + \#E_a = 1 + \#E_b + 1 + \#E_c - 1 - \#E_d = \#V_b + \#V_c - \#V_d, \]

and hence, since $\#V_b + \#V_c - \#(V_b \cap V_c) = \#V_a$, the inclusion $V_d \subset V_b \cap V_c$ is in fact
an equality. □

Lemma 7.7. Let $\Gamma(k,\ell,m)$ denote the set of equivalence classes of FK sentences
$a = [w_i]_{i=1}^m$ consisting of $m$ words such that $\sum_{i=1}^m \ell(w_i) = \ell$ and $\operatorname{wt} a = k$. We have

\[ \#\Gamma(k,\ell,m) \le 2^{\ell-m} \binom{\ell-1}{m-1} k^{2(m-1)}. \]

Proof. There are exactly $\binom{\ell-1}{m-1}$ $m$-tuples of positive integers summing to $\ell$ and
hence by Lemma 7.4 there are at most $2^{\ell-m}\binom{\ell-1}{m-1}$ ways to prescribe equivalence
classes of FK words $w_1,\dots,w_m$ subject to the constraint $\sum_{i=1}^m \ell(w_i) = \ell$. Now fix
FK words $w_1,\dots,w_m$ such that $\sum_{i=1}^m \ell(w_i) = \ell$. By Lemma 7.6 there exist at most
$k^{2(m-1)}$ equivalence classes of FK sentences $b = [x_i]_{i=1}^m$ such that $k = \operatorname{wt} b$ and $w_i \sim x_i$
for $i = 1,\dots,m$. The result follows. □

Lemma 7.8. For any FK sentence $a = [w_i]_{i=1}^m$ consisting of $m$ words we have

\[ m = \#E^1_a - 2\operatorname{wt} a + 2 + \sum_{i=1}^m \ell(w_i). \tag{29} \]

Proof. Put $M := \sum_{i=1}^m \ell(w_i)$. Consider the word $[\alpha_i]_{i=1}^M$ obtained by concatenating
the words of the sentence $a$. Consider the list $A = [\{\alpha_i, \alpha_{i+1}\}]_{i=1}^{M-1}$ of unordered
pairs of letters. Among the entries of $A$ we find $2\#E_a - \#E^1_a$ of them that are
edges of $G_a$, while the rest correspond to the $m - 1$ "commas" in the sentence $a$;
and moreover, since $G_a$ is a tree, we have $\#E_a = \operatorname{wt} a - 1$. The result follows. □

Proposition 7.9. For all positive integers $n$, $k$ satisfying $n \ge 2k - 2$ there are at
most

\[ N_{FK}(n,k) := 2^n n^{3(n-2k+2)} \tag{30} \]

equivalence classes of weak Wigner words $w$ such that $\ell(w) = n + 1$ and $\operatorname{wt} w = k$.

This is a crude but easy-to-apply version of the estimate one obtains by exploiting
the idea of "coding" introduced by Füredi and Komlós in [FK81].

Proof. Let $w$ be a weak Wigner word. Let $w'$ be the FK syllabification of $w$. Let $m$
be the number of words in the sentence $w'$. We must have $E^1_{w'} = \emptyset$ lest there exist
an edge of $G_w$ visited only once by $w$ and so we must have $m = \ell(w) - 2\operatorname{wt} w + 2$
by the preceding lemma. Therefore $\#\Gamma(k, n+1, n-2k+3)$ bounds the quantity
we wish to estimate, whence the desired result by Lemma 7.7 after a short further
calculation which we omit. □



7.10. Companion estimate. To exploit the preceding proposition we need also to
bound $E|\xi(w)|$ for all weak Wigner words $w$ such that $\ell(w) = n + 1$ and $k = \operatorname{wt} w$.
Fix such a word $w$ now. We claim that

\[ E|\xi(w)| \le C(3(n+2-2k)) \cdot C(2)^{n/2}, \quad \text{with } C(q) := 1 \vee \sup \max E|\xi_e|^q. \tag{31} \]

Consider the graph $G_w = (V_w, E_w)$ and let $\ell$ be the number of edges of $E_w$ visited
exactly twice by $w$. We have by (14) and the Hölder inequality that

\[ E|\xi(w)| \le C(n - 2\ell)\, C(2)^\ell. \]

We have $\#E_w \ge \#V_w - 1 = k - 1$ since $G_w$ is connected, $n \ge 3(\#E_w - \ell) + 2\ell$ by
counting, and hence

\[ n - 2\ell \le 3(n + 2 - 2k). \]

The desired estimate now follows since $C(q)$ is a nondecreasing function of $q$
bounded below by 1.

8. Bracelets, polarizations and enumeration

We have already seen in Section 5 that limiting variances are determined by the
enumeration of CLT word-pairs. In the current section, we study the structure of
such word-pairs and their associated graphs. These turn out to be classified by
certain "bracelets with pendant trees".

8.1. Graph-theoretical definitions. 

8.1.1. Bracelets. We say that a graph $G = (V,E)$ is a bracelet if there exists an
enumeration $\alpha_1,\dots,\alpha_r$ of $V$ such that

\[ E = \begin{cases} \{\{\alpha_1,\alpha_1\}\} & \text{if } r = 1,\\ \{\{\alpha_1,\alpha_2\}\} & \text{if } r = 2,\\ \{\{\alpha_1,\alpha_2\},\{\alpha_2,\alpha_3\},\{\alpha_3,\alpha_1\}\} & \text{if } r = 3,\\ \{\{\alpha_1,\alpha_2\},\{\alpha_2,\alpha_3\},\{\alpha_3,\alpha_4\},\{\alpha_4,\alpha_1\}\} & \text{if } r = 4, \end{cases} \]

and so on. We call $r$ the circuit length of the bracelet $G$.

8.1.2. Unicyclic graphs. We say that a graph $G = (V,E)$ is unicyclic if $G$ is
connected and $\#V = \#E$. In other words, a unicyclic graph is a connected graph with
one too many edges to be a tree. Any bracelet of circuit length $\ne 2$ is unicyclic.
However, a bracelet of circuit length 2 is a tree.

Proposition 8.2. Let $G = (V,E)$ be a unicyclic graph. For each edge $e \in E$ put
$G \setminus e = (V, E \setminus \{e\})$. Let $Z$ be the subgraph of $G$ consisting of all $e \in E$ such that
$G \setminus e$ is connected, along with all attached vertices. Let $r$ be the number of edges of
$Z$. Let $F$ be the graph obtained from $G$ by deleting all edges of $Z$. The following
statements hold:

(1) $F$ is a forest with exactly $r$ connected components.

(2) If $G$ has a degenerate edge, then $r = 1$.

(3) If $G$ has no degenerate edge, then $r \ge 3$.

(4) $Z$ meets each connected component of $F$ in exactly one vertex.

(5) $Z$ is a bracelet of circuit length $r$.

(6) For all $e \in E$ the following conditions are equivalent:

(a) $G \setminus e$ is connected.

(b) $G \setminus e$ is a tree.

(c) $G \setminus e$ is a forest.

We call $Z$ the bracelet of $G$. We call $r$ the circuit length of $G$, and each of the
components of $F$ we call a pendant tree.

Proof. The proposition is well-known in principle. We just explain how to prove
statement 5 and omit the remaining details. Pick an edge $e = \{\alpha,\beta\}$ of $G$ so that
$G \setminus e$ is a spanning tree. Then $e$ is an edge of $Z$, and it is not difficult to verify that
the edges of $Z$ distinct from $e$ are the edges of the tree $G \setminus e$ visited by the unique
geodesic in $G \setminus e$ joining $\alpha$ to $\beta$. So it is clear that $Z$ is a bracelet. □

8.3. The bracelet of a CLT word-pair. Fix a CLT word-pair $[w,x]$. Let $G =
G_{[w,x]}$ be the associated graph.

8.3.1. By Proposition 4.12 either $G$ is unicyclic or $G$ is a tree. If $G$ is unicyclic,
we define the bracelet, circuit length, and pendant trees of $[w,x]$ to be the same
as those defined for $G$ by Proposition 8.2. Suppose now that $G$ is a tree. Then
there exists by Proposition 4.12(3)(c) a unique edge of $G$ visited exactly twice by
$w$ and twice by $x$; this edge and attached vertices we declare to be the bracelet
of $[w,x]$, and we declare the circuit length of $[w,x]$ to be the circuit length of its
bracelet, namely 2. Erasing the edge of the bracelet from $G$ leaves a forest of two
components; as before, we call the components pendant trees of $[w,x]$. Note that
in all cases the circuit length of $[w,x]$ depends only on the equivalence class of the
word-pair $[w,x]$.

8.3.2. Let $Z$ and $r$ be the bracelet and circuit length of $[w,x]$, respectively. Note
that $G$ is unicyclic or a tree according to whether $r \ne 2$ or $r = 2$.




Figure 4. The bracelet 1234 of circuit length 4, and the pendant 
trees, associated with the CLT word-pair [12565752341,2383412] 
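The bracelet and pendant trees of Figure 4 can be recovered mechanically from the word-pair. The following Python sketch applies the definition of $Z$ (edges whose deletion leaves the graph connected) to the graph of $[12565752341, 2383412]$. It is an illustration under the assumption that the graph of a word-pair has the letters as vertices and the pairs of consecutive letters of each word as edges; all function names are hypothetical.

```python
def word_edges(word):
    """Edges {a, b} joining consecutive letters of a word."""
    return {frozenset((a, b)) for a, b in zip(word, word[1:])}

def count_components(vertices, edges):
    """Number of connected components of the graph (vertices, edges)."""
    remaining = set(vertices)
    count = 0
    while remaining:
        stack = [remaining.pop()]
        while stack:
            v = stack.pop()
            for e in edges:
                if v in e:
                    for u in e:
                        if u in remaining:
                            remaining.remove(u)
                            stack.append(u)
        count += 1
    return count

def bracelet(vertices, edges):
    """Edges e of a unicyclic graph such that deleting e leaves the graph connected."""
    return {e for e in edges if count_components(vertices, edges - {e}) == 1}

w = (1, 2, 5, 6, 5, 7, 5, 2, 3, 4, 1)
x = (2, 3, 8, 3, 4, 1, 2)
V = set(w) | set(x)
E = word_edges(w) | word_edges(x)
Z = bracelet(V, E)
print(sorted(tuple(sorted(e)) for e in Z))   # edges of the bracelet
print(count_components(V, E - Z))            # number of pendant trees
```

For this word-pair the bracelet edges coincide with the edges visited by both words, which is the content of part (i) of Lemma 8.4.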



8.3.3. Now write w = [ail^li and x = [(3j]j^l- Let w and x be the words ob- 
tained by dropping the last letters of w and x, respectively. Let a be any cyclic 
permutation of {1, ... , £{w)} and let r be any cyclic permutation of {1, ... , £{x)}. 
Then [w"' aa(i) , x'^ f3r(i)] is again a CLT word-pair with associated graph, bracelet 
and circuit length the same as for (The "exponential notation" used here 

was defined in ii4.3.5l l We declare the ordered pair (cr, r) to be a polarization of 
[w,x] if the last edge of G visited by the walk i&'^acr(i) equals the last edge of G 
visited by the walk x^ ^T^iy Note that the set of polarizations of a CLT word-pair 
depends only on its equivalence class. The notions of bracelet and polarization are 
linked by the following result. 

Lemma 8.4. Let $[w,x]$ be a CLT word-pair. Put $G = G_{[w,x]}$. Let $Z$ and $r$ denote
the bracelet and circuit length of $[w,x]$, respectively. Let $e$ be an edge of $G$. Then:
(i) $e$ is an edge of $Z$ if and only if both words/walks $w$ and $x$ visit $e$. (ii) Unless
$r = 2$, there exist exactly $r$ polarizations of $[w,x]$; but if $r = 2$, there exist exactly 4
polarizations of $[w,x]$.

In the example of Figure 4, the four polarizations lead to the CLT word-pairs

[12565752341, 1238341], [25657523412, 2383412],
[34125657523, 3834123], [41256575234, 4123834].

Proof. If $G$ is a tree, part (i) of the lemma holds by definition of $Z$, while part (ii)
is a consequence of Proposition 4.12(3). So assume for the rest of the proof that $G$
is unicyclic. By Proposition 4.12(4)(b) each edge of $G$ is visited a total of exactly
two times by $w$ and $x$, and so part (ii) of the lemma follows immediately from
part (i). We have only to prove part (i). ($\Rightarrow$) The graph $G \setminus e$ obtained by deleting
$e$ is by hypothesis a tree. If one of the words/walks $w$ or $x$ fails to visit $e$, say the
former, then by the parity principle $w$ must visit every edge of $G$ an even number
of times. But then, due to Proposition 4.12(4)(b), it is impossible for $w$ to visit any
edge of $G$ visited by $x$, which is a contradiction. ($\Leftarrow$) By hypothesis and Proposition
4.12(4)(b) the walk $w$ visits $e$ exactly once, hence some cyclic permutation of $w$ is
a walk on $G \setminus e$ the set of endpoints of which equals $e$, hence $G \setminus e$ is connected,
and hence $e$ is an edge of $Z$. □

Lemma 8.5. Let G be a forest. Let w and x be words both admitting interpretation 
as walks on G. Assume that jointly w and x visit every edge of G either exactly 
twice or never. (Necessarily then, both w and x are FK words.) Assume further 
that w and x have at least one letter in common. Then exactly one of the following 
conditions holds: 

(1) $w$ and $x$ have acronyms which are either equal or mirror images, but have no letters in common apart from those shared by their acronyms.

(2) $w$ and $x$ are Wigner words with exactly one letter in common, but this common letter does not appear as the first letter of both words.

Proof. Since any subgraph of a forest is again a forest, we may assume without loss of generality that

\[ G = (V,E) = (V_w \cup V_x,\; E_w \cup E_x). \]

Since $w$ and $x$ have at least one letter in common, in fact $G$ is a tree. Let $w_*$ and $x_*$ be the acronyms of $w$ and $x$, respectively. Note that $w_*$ (resp., $x_*$) is the unique geodesic in $G$ with the same initial and terminal vertices as $w$ (resp., $x$). By the parity principle and the hypotheses we have

\[
\begin{aligned}
\{e \in E \mid w \text{ visits } e \text{ exactly once}\}
&= \{e \in E \mid w_* \text{ visits } e\} = E_w \cap E_x = \{e \in E \mid x_* \text{ visits } e\} \\
&= \{e \in E \mid x \text{ visits } e \text{ exactly once}\},
\end{aligned}
\]

hence $w_*$ and $x_*$ are words of the same length, say $\ell$, and we have

\[ \ell = 1 + \#(E_w \cap E_x). \]

If $\ell > 1$, then the words $w_*$ and $x_*$ must either be equal or mirror images of each other. If $\ell = 1$, then $w$ and $x$ are Wigner words since each visits every edge of $G$ either exactly twice or never, but note that we need not in this case have equality of $w_*$ and $x_*$. Finally, since $G$, $G_w$ and $G_x$ are trees, we have

\[ \#V = 1 + \#E = 1 + \#E_w + \#E_x - \#(E_w \cap E_x) = \#V_w + \#V_x - \ell, \]

which finishes the proof. □

Proposition 8.6. Fix closed words $w$ and $x$ each of length > 2. Put $k = \ell(\tilde w)$ and $\ell = \ell(\tilde x)$. Let $\sigma$ (resp., $\tau$) be a cyclic permutation of $\{1,\dots,k\}$ (resp., $\{1,\dots,\ell\}$). The following statements are equivalent:

(1) $[w,x]$ is a CLT word-pair of which $(\sigma,\tau)$ is a polarization.

(2) $\tilde w^{\sigma}$ and $\tilde x^{\tau}$ are FK words with acronyms either equal or mirror images, and with no letters in common apart from those shared by their acronyms.

We remark that under the equivalent conditions above, the common length of the acronyms of $\tilde w^{\sigma}$ and $\tilde x^{\tau}$ equals the circuit length of $[w,x]$.

Proof. The implication 2$\Rightarrow$1 is easy to check. We omit the details. We turn directly to the proof of the implication 1$\Rightarrow$2. Write $w = [\alpha_i]_{i=1}^{\ell(w)}$ and $x = [\beta_j]_{j=1}^{\ell(x)}$. Let $e$ be the last edge of $G = G_{[w,x]}$ visited by the walks $\tilde w^{\sigma}\alpha_{\sigma(1)}$ and $\tilde x^{\tau}\beta_{\tau(1)}$. Note that $e$ by Lemma 8.4 is automatically an edge of the bracelet of $[w,x]$. Let $r$ be the circuit length of $[w,x]$. Unless $r = 2$, let $G'$ be the graph obtained by deleting $e$ from $G$, but if $r = 2$ put $G' = G$. Then in all cases $G'$ is a tree, and the words $\tilde w^{\sigma}$ and $\tilde x^{\tau}$ are walks on $G'$ satisfying the hypotheses of Lemma 8.5. Were $\tilde w^{\sigma}$ and $\tilde x^{\tau}$ to be Wigner words with exactly one letter in common not appearing as the first letter of both words, the graph $G$ would have two degenerate edges, which by Proposition 4.12 is impossible. □

8.7. Enumeration of CLT word-pairs by Wigner words. We are now ready to state an enumeration formula for CLT word-pairs similar to the earlier enumeration formulas (see (12)), albeit rather more complicated.

8.7.1. Enumerative apparatus. Let $[\gamma_i]_{i=1}^{\infty}$ be a sequence of distinct letters. For each positive integer $i$ choose cross sections $U_i$ and $V_i$ of the set of Wigner words. Make these choices so as to achieve the following conditions:

• For all $i$, every word belonging to $U_i \cup V_i$ begins with $\gamma_i$, but no word belonging to $U_i$ has a letter other than $\gamma_i$ in common with any word belonging to $V_i$.

• For all distinct $i$ and $j$, every word belonging to $U_i \cup V_i$ is disjoint from every word belonging to $U_j \cup V_j$.

Let $\varphi$ be a real-valued function defined for all sentences. Assume that $\varphi(a)$ depends only on the equivalence class of $a$ and vanishes when the sum of the lengths of the constituent words of $a$ is sufficiently large, in which case the support of $\varphi$ consists of only finitely many equivalence classes of sentences.



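8.7.2. Enumeration of CLT word-pairs. We have

\[
\sum_{a} \varphi(a) = \sum_{r=1}^{\infty} \sum_{u_1 \in U_1} \cdots \sum_{u_r \in U_r} \sum_{v_1 \in V_1} \cdots \sum_{v_r \in V_r} \sum_{\sigma} \sum_{\tau}
\begin{cases}
\varphi([u^{\sigma}\alpha_{\sigma(1)}, v^{\tau}\beta_{\tau(1)}]) & \text{if } r = 1, \\
\bigl(\varphi([u^{\sigma}\alpha_{\sigma(1)}, v^{\tau}\beta_{\tau(1)}]) + \varphi([u^{\sigma}\alpha_{\sigma(1)}, \bar v^{\tau}\bar\beta_{\tau(1)}])\bigr)/4 & \text{if } r = 2, \\
\bigl(\varphi([u^{\sigma}\alpha_{\sigma(1)}, v^{\tau}\beta_{\tau(1)}]) + \varphi([u^{\sigma}\alpha_{\sigma(1)}, \bar v^{\tau}\bar\beta_{\tau(1)}])\bigr)/r & \text{if } r \ge 3,
\end{cases}
\tag{32}
\]

where:

• $a$ ranges over any cross-section of the set of CLT word-pairs;

• $u = u_1 \cdots u_r = [\alpha_i]_{i=1}^{\ell(u)}$;

• $v = v_1 \cdots v_r = [\beta_i]_{i=1}^{\ell(v)}$ and $\bar v = v_r \cdots v_1 = [\bar\beta_i]_{i=1}^{\ell(v)}$;

• $\sigma$ ranges over cyclic permutations of $\{1,\dots,\ell(u)\}$; and

• $\tau$ ranges over cyclic permutations of $\{1,\dots,\ell(v)\}$.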
One verifies that there is neither under- nor over-counting by applying Proposition 7.3 (which gives the structure of FK words) and Proposition 8.6 (which gives the structure of CLT word-pairs) in a straightforward way. We omit further details.



9. Proof of Theorem 3.3
9.1. Further generating functions. Fix a sentence

\[ a = [w_i]_{i=1}^{n} = \bigl[[\alpha_{ij}]_{j=1}^{\ell(w_i)}\bigr]_{i=1}^{n} \]

consisting of $n$ words.

9.1.1. Let $t = [t_i]_{i=1}^{n}$ be an $n$-tuple of independent (algebraic) variables and put

\[ H(a,t) = \sum_{p} H_p(a) \prod_{i=1}^{n} t_i^{\,p_i + \ell(w_i)}, \]

where $p = [p_i]_{i=1}^{n}$ ranges over $n$-tuples of (nonnegative) integers and $w_i$ denotes the $i$th word of $a$. We view $H(a,t)$ as a formal power series in $t_1,\dots,t_n$ with random variable coefficients, not as an analytic function of $t$. In other words, $H(a,t)$ is just a device for manipulating the infinite array $[H_p(a)]$ of random variables. We write

MH{a, t) = M{a)H{a, t), MH{a, t) = M{a)H(a, t) 

in order to abbreviate notation. 

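9.1.2. Unraveling the definition of $H(\cdot,\cdot)$ in the case of a single word $w = [\alpha_j]_{j=1}^{\ell(w)}$, we find that

\[
H(w,t) = t^{\ell(w)} \sum_{\pi} \prod_{j=1}^{\ell(w)} \bigl(tD(\kappa(\alpha_j))\bigr)^{\pi_j} = \prod_{j=1}^{\ell(w)} \frac{t}{1 - tD(\kappa(\alpha_j))},
\tag{33}
\]

where $\pi = [\pi_j]_{j=1}^{\ell(w)}$ ranges over $\ell(w)$-tuples of nonnegative integers. From (33), it follows that

\[
\begin{aligned}
H(w\alpha_1, t) &= \left(\frac{t}{1 - tD(\kappa(\alpha_1))}\right)^{2} \prod_{j=2}^{\ell(w)} \frac{t}{1 - tD(\kappa(\alpha_j))} \\
&= \left(t^2 \frac{d}{dt}\, \frac{t}{1 - tD(\kappa(\alpha_1))}\right) \cdot \prod_{j=2}^{\ell(w)} \frac{t}{1 - tD(\kappa(\alpha_j))}.
\end{aligned}
\]

Taking the sum over all cyclic permutations $\sigma$ of $\{1,\dots,\ell(w)\}$, and arguing similarly, we find that

\[ \sum_{\sigma} H(w^{\sigma}\alpha_{\sigma(1)}, t) = t^2 \frac{d}{dt} H(w,t). \tag{34} \]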
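
Identity (34) is a product-rule statement and can be checked on truncated power series. The following sketch assumes, purely for illustration, that the values $D(\kappa(\alpha_j))$ are the integers in a hypothetical list `ds`; a power series in $t$ is represented by its list of coefficients up to a fixed degree, and none of the helper names come from the paper.

```python
def geometric(d, n):
    """Coefficients up to degree n of t/(1 - t*d) = sum_{p>=0} d**p * t**(p+1)."""
    return [0] + [d ** p for p in range(n)]


def multiply(a, b, n):
    """Product of two truncated series, again truncated at degree n."""
    out = [0] * (n + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= n:
                out[i + j] += ai * bj
    return out


def t2_ddt(a, n):
    """Apply t**2 * d/dt to a truncated series."""
    out = [0] * (n + 1)
    for m in range(n):
        # d/dt sends a[m]*t**m to m*a[m]*t**(m-1); times t**2 gives degree m+1.
        out[m + 1] = m * a[m]
    return out


def H(ds, n):
    """Product over the letters of t/(1 - t*d), in the spirit of (33)."""
    out = [1] + [0] * n
    for d in ds:
        out = multiply(out, geometric(d, n), n)
    return out


n, ds = 12, [2, 3, 5]  # illustrative values only

# Left side of (34): each closed-up rotation repeats exactly one letter of w.
lhs = [0] * (n + 1)
for d in ds:
    lhs = [x + y for x, y in zip(lhs, H(ds + [d], n))]

# Right side of (34).
rhs = t2_ddt(H(ds, n), n)
print(lhs == rhs)
```

The comparison holds coefficient by coefficient because $t^2 \frac{d}{dt}\bigl[t/(1-td)\bigr] = \bigl(t/(1-td)\bigr)^2$, which is the one-letter case used in the display preceding (34).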
9.1.3. Returning now to the general situation, and writing $w_i = [\alpha_{ij}]_{j=1}^{\ell(w_i)}$ for the $i$th word of $a$, from (33) we get the identity

\[
H(a,t) = \prod_{i=1}^{n} \prod_{j=1}^{\ell(w_i)} \frac{t_i}{1 - t_i D(\kappa(\alpha_{ij}))} = \prod_{i=1}^{n} H(w_i, t_i).
\tag{35}
\]

From (34) and (35) we get the differentiation formula

\[
t_1^2 \frac{\partial}{\partial t_1} \cdots t_n^2 \frac{\partial}{\partial t_n} H(a,t) = \sum_{\sigma_1} \cdots \sum_{\sigma_n} H\bigl([w_i^{\sigma_i}\alpha_{i,\sigma_i(1)}]_{i=1}^{n}, t\bigr),
\tag{36}
\]

where in the sum $\sigma_i$ ranges over cyclic permutations of $\{1,\dots,\ell(w_i)\}$. We emphasize that these identities are to be interpreted formally, i.e., all the expressions are to be expanded as power series in $t_1,\dots,t_n$ in evident fashion and then coefficients of like monomials in the $t_i$ are to be equated.

9.1.4. For each Wigner word w we define 

00 

$W(c,t) = ^ (c)i^('")+f. 

As with the generating functions introduced above, this, too, is to be viewed as a formal power series in $t$. By Lemma 6.4 we have

\[ E(MH(w,t) \mid \kappa(\alpha)) = \Phi^{(w)}(\kappa(\alpha), t) \quad \text{a.s.} \tag{37} \]

where, to make sense of the formula, both sides are expanded in powers of $t$, the integrals on the left are computed term by term, and then coefficients of like powers of $t$ are to be set equal a.s. By Lemma 6.3 we have

\[ \Phi(c,t) = \sum_{w} \Phi^{(w)}(c,t) \tag{38} \]

where $w$ ranges over a cross-section of the set of Wigner words. Note that in the sum on the right, for every fixed degree $n$, there are only finitely many terms in which the coefficient of $t^n$ is nonvanishing.

Lemma 9.2. We have an identity 

^ c>o oof \ 

(39) E E ^^^^^ ■ ^'y' = (^2e(x, y) + ^{x, y) j 

of formal power series, in which the Gaussian family is the one defined in Lemma 5.7.

Proof. Let $[\gamma_i, U_i, V_i]_{i=1}^{\infty}$ be the enumerative apparatus introduced in §8.7.1. In anticipation of applying enumeration formula (32) we temporarily "freeze" data specifying a single term on the right side of that formula:

• Let $r$ be a positive integer.

• Let $u_1 \in U_1, \dots, u_r \in U_r$ and $v_1 \in V_1, \dots, v_r \in V_r$.

• Let $u = [\alpha_i]_{i=1}^{\ell(u)} = u_1 \cdots u_r$.

• Let $v = [\beta_i]_{i=1}^{\ell(v)}$ be equal either to $v_1 \cdots v_r$ or to $v_r \cdots v_1$.

• Let $\sigma$ be a cyclic permutation of $\{1,\dots,\ell(u)\}$.

• Let $\tau$ be a cyclic permutation of $\{1,\dots,\ell(v)\}$.
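By direct appeal to the definitions we have

\[
\begin{aligned}
M([u\alpha_1, v\beta_1]) &= M([u^{\sigma}\alpha_{\sigma(1)}, v^{\tau}\beta_{\tau(1)}]) \\
&= \prod_{i=1}^{r} \bigl(M(u_i)M(v_i)\bigr) \cdot
\begin{cases}
d^{(2)}(\kappa(\gamma_1)) & \text{if } r = 1, \\
s^{(4)}(\kappa(\gamma_1),\kappa(\gamma_2)) - s^{(2)}(\kappa(\gamma_1),\kappa(\gamma_2))^2 & \text{if } r = 2, \\
K_r(\kappa(\gamma_1),\dots,\kappa(\gamma_r)) & \text{if } r \ge 3.
\end{cases}
\end{aligned}
\tag{40}
\]

To understand this formula, notice that the right side is a product of factors associated to pendant trees times a factor arising from the bracelet. Put

\[ \mathcal{F} = \sigma\bigl(\{\kappa(\gamma_i)\}_{i=1}^{r}\bigr), \]

the $\sigma$-field generated by the colors of the letters $\gamma_1,\dots,\gamma_r$.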
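We then have the following identities:

\[
\begin{aligned}
&\sum_{\sigma} \sum_{\tau} E\bigl(MH([u^{\sigma}\alpha_{\sigma(1)}, v^{\tau}\beta_{\tau(1)}], [x,y]) \mid \mathcal{F}\bigr) \\
&\quad= \sum_{\sigma} \sum_{\tau} E\bigl(M([u\alpha_1, v\beta_1])\, H(u^{\sigma}\alpha_{\sigma(1)}, x)\, H(v^{\tau}\beta_{\tau(1)}, y) \mid \mathcal{F}\bigr) \\
&\quad= x^2 \frac{\partial}{\partial x}\, y^2 \frac{\partial}{\partial y}\, E\bigl(M([u\alpha_1, v\beta_1])\, H(u,x)\, H(v,y) \mid \mathcal{F}\bigr) \\
&\quad= x^2 \frac{\partial}{\partial x} \prod_{i=1}^{r} \Phi^{(u_i)}(\kappa(\gamma_i), x) \cdot y^2 \frac{\partial}{\partial y} \prod_{i=1}^{r} \Phi^{(v_i)}(\kappa(\gamma_i), y) \cdot
\begin{cases}
d^{(2)}(\kappa(\gamma_1)) & \text{if } r = 1, \\
s^{(4)}(\kappa(\gamma_1),\kappa(\gamma_2)) - s^{(2)}(\kappa(\gamma_1),\kappa(\gamma_2))^2 & \text{if } r = 2, \\
K_r(\kappa(\gamma_1),\dots,\kappa(\gamma_r)) & \text{if } r \ge 3.
\end{cases}
\end{aligned}
\tag{41}
\]

Here all the conditional expectations are to be calculated by expanding formally in powers of $x$ and $y$ and then integrating term by term; in the same spirit the equal signs are to be interpreted as a.s. equality term by term between formal power series. The preceding holds at the second equality by the differentiation formula (36), and at the third equality by (37). Now take expectations (again, integrating term by term), and then apply identity (38) and enumeration formula (32) to find that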

(42) MH{[u, v], [x, y]) = x^g^y^^ heix, y) + ^{x, y) 

[u,v] 

where on the left $[u,v]$ ranges over a cross-section of the set of CLT word-pairs. The result now follows from the definition of the Gaussian family in Lemma 5.7. □

9.3. End of the proof of Theorem 3.3. By limit formula (20), Lemma 5.7 and Lemma 9.2 we have for every nonnegative integer $n$ that

where 

oo 
i=l 

So the method of moments gives the result. □ 
10. Proof of Theorem 3.4

Because of the strong similarity between the proofs of Theorem 3.3 and Theorem 3.4, and because of the tedious nature of the latter proof (the necessary enumerations are rather involved), we proceed quite rapidly, omitting many details. But we strive to provide all the important "landmarks" so that the reader won't get lost.

10.1. Further random variables indexed by sentences. We enlarge the supply of random variables introduced in §5.1 as follows.

10.1.1. Given any finite nonempty set $\mathcal{N}$ of letters, let $\{\kappa_{\mathcal{N}}(\alpha)\}$ be a letter-indexed color-valued family of i.i.d. random variables with common law $\theta_{\mathcal{N}}$. Then, with all other data as above, for any word $w$ and integer $p$ we define $M_{\mathcal{N}}(w)$ and $H_{p,\mathcal{N}}(w)$ by repeating the definitions of $M(w)$ and $H_p(w)$, see §5.1.4 and §5.1.6, with $\theta_{\mathcal{N}}$ in place of $\theta$. In fact, these random variables were already considered in the course of the proof of Theorem 3.2, see equation (28).

10.1.2. Given distinct letters $\alpha$ and $\beta$, let $[\beta \mapsto \alpha]$ be the unique map of letter space to itself sending $\beta$ to $\alpha$ but fixing all other letters. Given also a word $w$, let $[\beta \mapsto \alpha]_* w$ be the word obtained by applying $[\beta \mapsto \alpha]$ letter by letter to $w$.

10.1.3. Let $\psi$ be any map of letter space to itself. Let $w$ be any closed word. Put

r if iy{e) = 1 

M(w,7/')= Yi { s^''^''"(«(V'(a)),K(V'(/3))) if !^(e) > 1 and a 7^ /3, 
e={a,/3}, [ d(''('=))(K(V'(a))) if i/(e) > 1 and a = /3, 

edge of Gill 

where $\nu(e)$ is the number of visits made by $w$ to $e$. Note that if $\psi$ is the identity map, then $M(w,\psi) = M(w)$. The only case of the generalization $M(\cdot,\cdot)$ of $M(\cdot)$ figuring in our limit formulas is that in which $w$ is a Wigner word and $\psi = [\beta \mapsto \alpha]$ for some distinct letters $\alpha$ and $\beta$ appearing in $w$. Note that in that case $M(w,\psi)$ depends only on $s^{(2)}$, not on $\{s^{(k)}\}_{k \neq 2} \cup \{d^{(k)}\}$.

10.2. Approximation of $\langle L(\mathcal{N}_k), x^n \rangle$ at CLT scale. Fix a positive integer $n$. With notation as in Assumption 3.1, and starting again with formula (17), it is possible to obtain the formula

£rn Nk ■ I {L{Afk),x'') - EMj^,H^+,^,(^^y^^ {w) J 

\ w I 

(43) = -\Y. ^M(w,[/3^a])i/„+i_,(„)([/3^a],w) 

[u,a,/3] 

-H^£;MiJ„+l_f(„)(v) 

V 

where: 

• $w$ ranges over a cross-section of the set of Wigner words;

• $[u,\alpha,\beta]$ ranges over a cross-section of the set of marked Wigner words; and

• $v$ ranges over a cross-section of the set of critical weak Wigner words.

Note that only finitely many nonzero terms appear in the sums. Since the proof of (43) is quite similar to that of (18), if rather more complicated, we omit the details. We only give the following hint to the reader. Let us return for a moment to the setup of §5.4.2. We have

iV(5(AA, w) - EAWH,,+,_fM,M) - E fMPi), . . . , i^oiPr)) 

(/3i,...,/3,)GA/-'- 
#{/3i,...,/3,}<r 

and up to an 0{N^^) error the right side equals 

-iV"'' /(Ko(/3[jWi](l)),---,Ko(/3[jW^](r))) 

l<i<j<r ((}i,...,Pr)eAr-^ 

where $[j \mapsto i]$ denotes the map of $\{1,\dots,r\}$ to itself sending $j$ to $i$ and fixing all other elements. In the case that color space consists of a single color, the preceding remark boils down to the observation that

\[ N(N-1)\cdots(N-r+1) - N^r = -\binom{r}{2} N^{r-1} + \cdots \]

where the omitted terms are $O(N^{r-2})$.
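
The single-color observation is an elementary statement about the falling factorial and can be confirmed by expanding the polynomial exactly. The sketch below is only an illustration; the function name `falling_factorial_coeffs` is hypothetical. It multiplies out $N(N-1)\cdots(N-r+1)$ with integer coefficients and compares the coefficient of $N^{r-1}$ with $-\binom{r}{2}$, the number of pairs $i < j$ taken with a minus sign.

```python
from math import comb


def falling_factorial_coeffs(r):
    """Coefficients of N(N-1)...(N-r+1) as a polynomial in N.

    The result is a list c with c[k] the coefficient of N**k.
    """
    coeffs = [1]  # the constant polynomial 1
    for j in range(r):
        # Multiply the current polynomial p(N) by (N - j).
        shifted = [0] + coeffs                    # N * p(N)
        scaled = [-j * c for c in coeffs] + [0]   # -j * p(N)
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs


for r in range(2, 7):
    c = falling_factorial_coeffs(r)
    # Leading coefficient, coefficient of N**(r-1), and -C(r, 2) for comparison.
    print(r, c[r], c[r - 1], -comb(r, 2))
```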

10.3. Enumeration of marked Wigner words by Wigner words. We use the enumerative apparatus introduced in §8.7.1. We have
(44) 

OC 

[w.a.P] r=l uieUi u^eUr viEVi v,.eVr a 

where: 

• $[w,\alpha,\beta]$ ranges over any cross-section of the set of marked Wigner words;

• $u = u_1 \cdots u_r\,([\gamma_1 \mapsto \gamma_{r+1}]_* v_1)\, v_r \cdots v_2 = [\alpha_i]_{i=1}^{\ell(u)}$; and

• $\sigma$ ranges over cyclic permutations of $\{1,\dots,\ell(u)\}$.

Note that in this setting

\[
M(u^{\sigma}\alpha_{\sigma(1)}, [\gamma_{r+1} \mapsto \gamma_1]) = \prod_{i=1}^{r} \bigl(M(u_i)M(v_i)\bigr) \cdot K_r(\kappa(\gamma_1),\dots,\kappa(\gamma_r)).
\tag{45}
\]

The intuition behind (44) is as follows. Let $[w,\alpha,\beta]$ be a marked Wigner word, write $w = [\alpha_i]_{i=1}^{\ell(w)}$, and let $\tilde w$ be the result of dropping the last letter of $w$. After replacing $w$ by $\tilde w^{\sigma}\alpha_{\sigma(1)}$ for a certain uniquely determined cyclic permutation $\sigma$ of $\{1,\dots,\ell(\tilde w)\}$, we may assume that $\alpha$ is the first letter of $w$ and that every appearance of $\alpha$ in $\tilde w$ precedes every appearance of $\beta$. We may then view $w$ as a walk out and back on the geodesic connecting $\alpha$ to $\beta$ in the tree $G_w$, punctuated by sidetrips on the trees hanging from that geodesic. More precisely, an argument employing Proposition 4.5 (which gives the structure of Wigner words), Proposition 7.3 (which gives the structure of FK words) and Lemma shows that there is neither under- nor over-counting in (44). We omit the details.

10.4. The bracelet of a critical weak Wigner word. Let $w = [\alpha_i]_{i=1}^{\ell(w)}$ be a critical weak Wigner word. Put $G = (V,E) = G_w = (V_w, E_w)$.

10.4.1. According to Proposition 4.8 either $G$ is unicyclic or $G$ is a tree. If $G$ is unicyclic, then we define the bracelet and circuit length of $w$ to be the same as defined for $G$ in Proposition 8.2. If $G$ is a tree, then there exists a unique edge $e$ of $G$ visited exactly 4 times by $w$; this edge and attached vertices we declare to be the bracelet of $w$, and we declare the circuit length of $w$ to be that of its bracelet, namely 2.

10.4.2. Let $Z$ and $r$ be the bracelet and circuit length of $w$, respectively. Note that $r \neq 2$ or $r = 2$ according to whether $G$ is unicyclic or a tree. Note that in all cases the graph obtained from $G$ by deleting the edges of $Z$ is a forest with exactly $r$ connected components each of which meets the bracelet in exactly one vertex; again we have a picture of "bracelet with pendant trees". Note that in all cases $w$ makes a total of $2r$ visits to edges of $Z$. Moreover the walk $w$ visits each edge of the bracelet exactly twice, unless $r = 2$, in which case $w$ visits the unique edge of the bracelet exactly 4 times.

10.4.3. As in §8.3.3 let $\tilde w$ be the result of dropping the last letter of $w$ and let $\sigma$ be a cyclic permutation of $\{1,\dots,\ell(\tilde w)\}$. Then $\tilde w^{\sigma}\alpha_{\sigma(1)}$ is a critical weak Wigner word with graph, bracelet and circuit length the same as for $w$. We say that $\sigma$ is a polarization of $w$ if the last edge of $G$ visited by the walk $\tilde w^{\sigma}\alpha_{\sigma(1)}$ is an edge of $Z$. Clearly:

• There exist exactly $2r$ polarizations of $w$.

Note that the set of polarizations of $w$ depends only on the equivalence class of $w$.

10.4.4. Suppose now that we are given a polarization $\sigma$ of $w$. We define the canonical decomposition

\[ \tilde w^{\sigma} = p_1 p_2 \cdots p_{2r-1} p_{2r} \]

associated to $\sigma$ to be the unique decomposition with breaks at visits of the walk $\tilde w^{\sigma}$ to edges of the bracelet. From the bracelet-and-pendant-trees picture it is not difficult to deduce that each $p_i$ is a Wigner word and that no two of the $p_i$ have letters in common with the exception that first letters may coincide. Let $s = \alpha_1 \cdots \alpha_{2r}$ be the sequence of first letters of the $p_i$. We call $s$ the signature associated to the critical weak Wigner word $w$ and its polarization $\sigma$. Necessarily $s\alpha_1$ is a walk on the bracelet of $w$ visiting every edge of the bracelet exactly twice unless $r = 2$, in which case $s\alpha_1$ visits the unique edge of the bracelet exactly 4 times. Up to equivalence of words there are very few possibilities for $s$. In fact, the following possibilities are mutually exclusive and exhaustive:

• $r \ge 3$ and $s \sim 123 \cdots r123 \cdots r$.

• $r = 1$ and $s \sim 11$.

• $r = 2$ and $s \sim 1212$.

• $r \ge 3$ and $s^{\tau} \sim 123 \cdots r1r \cdots 2$ for some cyclic permutation $\tau$ of $\{1,\dots,2r\}$.

In the first case we say that the signature is unidirectional, whereas in the remaining cases we say that the signature is backtracking. Notice that if $s$ is unidirectional (resp., backtracking) for some polarization $\sigma$, then $s$ is unidirectional (resp., backtracking) for all polarizations $\sigma$. Thus it makes sense to say that $w$ itself is either unidirectional or backtracking.
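
The visit counts claimed for the representative signatures can be checked mechanically. The sketch below, with hypothetical helper names `signature` and `edge_visits`, assumes the bracelet letters are the integers 1, ..., r; it builds each representative signature named in the list, closes it up by appending its first letter, and counts how often each unordered edge is traversed.

```python
from collections import Counter


def signature(r, kind):
    """Representative signature of length 2r, as a list of integers."""
    if kind == "unidirectional":
        if r < 3:
            raise ValueError("unidirectional signatures require r >= 3")
        return list(range(1, r + 1)) * 2              # 1 2 ... r 1 2 ... r
    if kind == "backtracking":
        if r == 1:
            return [1, 1]
        if r == 2:
            return [1, 2, 1, 2]
        # 1 2 ... r 1 r ... 2
        return list(range(1, r + 1)) + [1] + list(range(r, 1, -1))
    raise ValueError(f"unknown kind: {kind}")


def edge_visits(s):
    """Count visits to each unordered edge by the closed walk s + [s[0]]."""
    closed = s + [s[0]]
    return Counter(frozenset(step) for step in zip(closed, closed[1:]))


for r, kind in [(1, "backtracking"), (2, "backtracking"),
                (4, "backtracking"), (4, "unidirectional")]:
    s = signature(r, kind)
    print(r, kind, s, sorted(edge_visits(s).values()))
```

For $r \ge 3$ both kinds of signature traverse each of the $r$ bracelet edges exactly twice; for $r = 2$ the single edge is traversed 4 times, and for $r = 1$ the loop is traversed twice.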
10.4.5. If $w$ is backtracking, then for some polarization $\sigma$ the associated signature is of the form 11 if $r = 1$, 1212 if $r = 2$, or $123 \cdots r1r \cdots 2$ if $r \ge 3$, in which case we say that $\sigma$ is a strong polarization of $w$. It is not difficult to verify that:

• If $w$ is backtracking, there exist exactly 2 strong polarizations of $w$ unless $r = 2$, in which case every polarization is strong (and so there exist exactly 4 strong polarizations).

Note that the set of strong polarizations of $w$ depends only on the equivalence class of $w$.

10.5. Enumeration of critical weak Wigner words by Wigner words. We again use the enumerative apparatus introduced in §8.7.1. We have



(46) EE'-'E E'-'E E^("'^"-(i))/ (2 ifr^2 

00 

+E E •■■ E E ■•■ E E^(-^/5^a))/2- 

where: 

• $w$ ranges over any cross-section of the set of critical weak Wigner words;

• $u = u_1 \cdots u_r v_1 v_r \cdots v_2 = [\alpha_i]_{i=1}^{\ell(u)}$;

• $\sigma$ ranges over cyclic permutations of $\{1,\dots,\ell(u)\}$;

• $v = u_1 \cdots u_r v_1 \cdots v_r = [\beta_i]_{i=1}^{\ell(v)}$; and

• $\tau$ ranges over cyclic permutations of $\{1,\dots,\ell(v)\}$.

In this setting we have

\[
M(u^{\sigma}\alpha_{\sigma(1)}) = \prod_{i=1}^{r} \bigl(M(u_i)M(v_i)\bigr) \cdot
\begin{cases}
d^{(2)}(\kappa(\gamma_1)) & \text{if } r = 1, \\
s^{(4)}(\kappa(\gamma_1),\kappa(\gamma_2)) & \text{if } r = 2, \\
K_r(\kappa(\gamma_1),\dots,\kappa(\gamma_r)) & \text{if } r \ge 3,
\end{cases}
\tag{47}
\]

and we have an analogous expression for $M(v^{\tau}\beta_{\tau(1)})$. Formula (46) may be derived from the preceding discussion of the bracelet of a critical weak Wigner word in a straightforward way. We omit the details.

10.6. End of the proof. The left sides of (O and coincide by formula 
coming up in the proof of Theorem 3.2. So we can rewrite (43) as an identity of formal power series



(48) 



(lim TVfe • iimklx") - t"+' 

n=l ^ ' 



2 



where: 



• $[w,\alpha,\beta]$ ranges over a cross-section of the set of marked Wigner words; and

• $u$ ranges over a cross-section of the set of critical weak Wigner words.

To finish the proof of the theorem we have just to make the right side of (48) explicit. This can be done by exploiting (15) and (17). Note that many of the terms in the sum on $[w,\alpha,\beta]$ are cancelled by terms in the sum on $u$ due to the parallel structure of formulas (45) and (47). We omit the remaining details of the proof because the calculations are very similar to those undertaken to prove Lemma 9.2. The proof of Theorem 3.4 is complete. □



11. Concentration 

In this section we work out sufficient conditions allowing one to prove a CLT for test functions more general than polynomials. Toward this end, we define for random matrices a notion of concentration and a notion of CLT for polynomial test functions. Then, assuming concentration, a polynomial-type CLT, and a further condition on the limiting covariance for polynomial test functions, we prove a CLT for continuously differentiable test functions with polynomial growth (Proposition 11.6). Furthermore, we establish the concentration property for the matrices $X(\mathcal{N}_k)$ studied in Theorems 3.2 and 3.3 when the random variables $\xi_{\{\alpha,\beta\}}$ satisfy the Poincaré inequality with the same constant (Proposition 11.8). The main result of this section (Theorem 11.10) summarizes the preceding considerations in a fashion convenient for applications in §12.

11.1. The concentration property. Throughout this section $\{Y_k\}_{k=1}^{\infty}$ denotes a sequence of random symmetric matrices. For such a general sequence we are going to define and study a concentration property. Eventually we are going to take $Y_k = X(\mathcal{N}_k)$, but in anticipation of applications of the concentration idea beyond the scope of this paper, we work in a general setting until the end of the proof of Proposition 11.6. For any Lipschitz function $g : \mathbb{R}^n \to \mathbb{R}$ set

\[ \|g\|_{\mathrm{Lip}} := \sup_{x \neq y \in \mathbb{R}^n} \frac{|g(x) - g(y)|}{|x - y|}, \tag{49} \]

where $|x - y|$ is the Euclidean distance between $x$ and $y$.

Definition 11.2. We say that the sequence of matrices $\{Y_k\}_{k=1}^{\infty}$ satisfies the concentration property under the following conditions:

(50) There exists a constant $c > 0$ such that for any Lipschitz function $g : \mathbb{R} \to \mathbb{R}$, it holds that $\sup_k \operatorname{Var} \operatorname{tr} g(Y_k) \le c\,\|g\|_{\mathrm{Lip}}^2$.

(51) There exists a compact set $S \subset \mathbb{R}$ such that for any function $f : \mathbb{R} \to \mathbb{R}$ supported in $S^c$ of polynomial growth, it holds that $E([\operatorname{tr} f(Y_k)]^2) \to_{k \to \infty} 0$.

The next lemma deduces from (50) and (51) a single statement convenient for applications:

Lemma 11.3. Suppose that $\{Y_k\}_{k=1}^{\infty}$ satisfies the concentration property. Then there exists a constant $c > 0$ and a compact interval $T$ such that for any function $f$ continuously differentiable on $T$ and of polynomial growth one has

\[ \limsup_{k \to \infty} \operatorname{Var} \operatorname{tr} f(Y_k) \le c \sup_{x \in T} |f'(x)|^2. \]
fe^oo xeT 
Proof. Let $S$ be as in (51). Choose a compact interval $I$ with interior containing the set $S$, and then choose a compact interval $T$ with interior containing $I$. Let $g : \mathbb{R} \to [0,1]$ be a continuously differentiable function identically equal to 1 on $I$ and identically vanishing in the complement of $T$. Let $\ell$ be the length of $T$. Without loss of generality we may assume that $f$ vanishes at some point of $T$, so that $|f| \le \ell \sup_{t \in T} |f'(t)|$ on $T$. Then

\[ \|fg\|_{\mathrm{Lip}} \le \Bigl(1 + \ell \sup_{t \in T} |g'(t)|\Bigr) \sup_{t \in T} |f'(t)|, \qquad \operatorname{supp} f(1-g) \subset S^c \]

and

\[ [\operatorname{Var} \operatorname{tr} f(Y_k)]^{1/2} \le [\operatorname{Var} \operatorname{tr} (fg)(Y_k)]^{1/2} + \bigl(E[\operatorname{tr} (f(1-g))(Y_k)]^2\bigr)^{1/2}, \]

whence the result by definition of the concentration property. □

11.4. CLT's for differentiable test functions. Our goal is to prove under suitable hypotheses a central limit theorem for random variables of the form

\[ Z_{f,k} := \operatorname{tr} f(Y_k) - E \operatorname{tr} f(Y_k) \]

where $f$ is continuously differentiable on a large enough compact set and of polynomial growth.

Definition 11.5. We say that the sequence $\{Y_k\}_{k=1}^{\infty}$ satisfies a polynomial-type CLT if there exists a mean zero Gaussian family $\{W_n\}_{n=0}^{\infty}$ of random variables such that for every polynomial function $f(x) = \sum_{i=0}^{n} a_i x^i$ it holds that $Z_{f,k}$ converges in distribution as $k \to \infty$ to $W_f := \sum_{i=0}^{n} a_i W_i$.

The next proposition gives hypotheses under which one can extend a CLT statement from polynomial test functions to differentiable test functions of polynomial growth. After proving the proposition, verification of its hypotheses for $Y_k = X(\mathcal{N}_k)$ under the assumptions of Theorems 3.2 and 3.3 along with further structural assumptions concerning the functions $d^{(2)}$, $s^{(2)}$ and $s^{(4)}$ will be our task for the rest of the paper.

Proposition 11.6. Assume that the sequence of matrices $\{Y_k\}_{k=1}^{\infty}$ satisfies both the concentration property and a polynomial-type CLT. Assume further the existence of a sequence $\{q_n\}_{n=1}^{\infty}$ of polynomial functions with the following properties:

• For some compactly supported finite measure $\nu$ on $\mathbb{R}$ the sequence $\{q_n\}_{n=1}^{\infty}$ is an orthonormal system in $L^2(\nu)$.

• Every polynomial in $x$ is a finite linear combination of the $q_n(x)$.

• With $\hat q_n(x) := \int_0^x q_n(y)\,dy$, the covariance matrix $K(m,n) := E W_{\hat q_m} W_{\hat q_n}$ of the mean zero Gaussian family $\{W_{\hat q_n}\}_{n=1}^{\infty}$ is diagonal.

Fix $T$ and $c$ as in Lemma 11.3 with $T \supset \operatorname{supp} \nu$. Then, for any function $f$ of polynomial growth which is continuously differentiable on $T$, the random variables $Z_{f,k}$ converge in distribution to a mean zero Gaussian random variable $Z_f$ with variance

\[ E Z_f^2 = \|f'\|_K^2 \le c \sup_{t \in T} |f'(t)|^2, \tag{52} \]

where for any function $h$ continuous on $T$ we set

\[ \|h\|_K^2 := \sum_{n=1}^{\infty} K(n,n)\, \langle \nu, h q_n \rangle^2. \]
Proof. Consider at first the case in which $f$ is a polynomial. The polynomial-type CLT implies that the variables $Z_{f,k}$ converge in distribution to $W_f$, and since $f$ differs by a constant from a finite linear combination of the antiderivatives $\hat q_n(x) = \int_0^x q_n(y)\,dy$, the variance $E W_f^2$ takes the value asserted in (52), namely $\|f'\|_K^2$. Furthermore, by Lemma 11.3 and the Fatou Lemma, the estimate for $\|f'\|_K^2$ asserted in (52) holds. Thus all assertions are proved if $f$ is a polynomial function.

We turn to consideration of the general case. Let $\{Q_m\}_{m=1}^{\infty}$ be a sequence of polynomials tending uniformly on $T$ to $f'$ (such is provided by the Stone-Weierstrass theorem) and put $\hat Q_m(x) := \int_0^x Q_m(y)\,dy$. Clearly, the sequence $\{Q_m\}_{m=1}^{\infty}$ is $\|\cdot\|_{L^2(\nu)}$-Cauchy. But by (52) the sequence $\{Q_m\}_{m=1}^{\infty}$ is also $\|\cdot\|_K$-Cauchy. A dominated convergence argument now shows that $\|f'\|_K^2 = \lim_{m \to \infty} \|Q_m\|_K^2$. It follows that the estimate for $\|f'\|_K^2$ asserted in (52) holds. By Lemma 11.3 the family of random variables $Z_{f,k}$ is tight; let $Y$ be any subsequential limit-in-distribution. For any $t \in \mathbb{R}$ one has

\[
\bigl|E e^{itY} - E e^{itW_{\hat Q_m}}\bigr| \le \limsup_{k \to \infty} E\bigl|e^{itZ_{f - \hat Q_m,k}} - 1\bigr| \le |t| \limsup_{k \to \infty} \bigl(E Z_{f - \hat Q_m,k}^2\bigr)^{1/2}.
\]

The quantity on the right by Lemma 11.3 tends to 0 as $m \to \infty$, and clearly

\[ E e^{itW_{\hat Q_m}} = \exp\bigl(-t^2 \|Q_m\|_K^2 / 2\bigr) \to_{m \to \infty} \exp\bigl(-t^2 \|f'\|_K^2 / 2\bigr). \]

Therefore (the characteristic function of) $Y$ is (that of) a mean zero Gaussian random variable of variance $\|f'\|_K^2$. Since all subsequential limits are the same we get convergence-in-distribution of $Z_{f,k}$ to a mean zero Gaussian random variable of variance $\|f'\|_K^2$. All assertions have been proved. □

11.7. Poincaré inequalities for matrices. We say that a probability distribution $\eta$ on $\mathbb{R}$ satisfies a Poincaré inequality if there exists a constant $c_\eta$ such that for any $f : \mathbb{R} \to \mathbb{R}$ smooth, it holds that

\[ \operatorname{Var}_\eta(f) := \int \Bigl(f(x) - \int f(y)\,\eta(dy)\Bigr)^2 \eta(dx) \le c_\eta \int |f'(x)|^2\, \eta(dx). \]

For such a distribution $\eta$ one has

\[ E \exp\left(\frac{|Y - EY|}{12\sqrt{c_\eta}}\right) \le 2 \quad (Y\text{: random variable with law } \eta), \tag{53} \]

see [BU83, Theorem 2] (or [Bo99] for optimal constants).

It is well known (see, e.g., [Le01, pg. 49]) that if $\eta_i$, $i = 1,\dots,K$, satisfy Poincaré inequalities with constants $c_{\eta_i}$, then for any smooth function $g : \mathbb{R}^K \to \mathbb{R}$, and with $\eta = \otimes_{i=1}^{K} \eta_i$ and $c_\eta = \max_{i=1}^{K} c_{\eta_i}$, one has

\[ \operatorname{Var}_\eta(g) := \int \Bigl(g(x) - \int g(y)\,\eta(dy)\Bigr)^2 \eta(dx) \le c_\eta \int |\nabla g(x)|^2\, \eta(dx). \tag{54} \]

We recall (see e.g. jGZOOl Lemma 1.2]) that if / : R R is Lipschitz with Lipschitz 
constant ||/||Lip, then the function /jv : R^t^+i)/^ ^ R on A^-by-A^ symmetric 
matrices given by /at (A) — tr/(A) is Lipschitz with Lipschitz constant ||/Ar||Lip < 
V^||/||Lip- It follows that if X is an N-hy-N symmetric random matrix with on- 
or-above-diagonal entries independent and satisfying the Poincare inequality with 
the same constant $c/N$, then for any Lipschitz $f : \mathbb{R} \to \mathbb{R}$ one has

\[ \operatorname{Var} \operatorname{tr} f(X) \le c\,\|f\|_{\mathrm{Lip}}^2. \]

See [CB04] for a systematic use of this fact, and [GZ00] for other concentration inequalities for random matrices. In particular, in the setting of Theorems 3.2 and 3.3, if the random variables $\xi_{\{\alpha,\beta\}}$ satisfy the Poincaré inequality with the same constant $c$, we see that (50) above holds true, with $Y_k = X(\mathcal{N}_k)$.

Proposition 11.8. In the setting and under the hypotheses of Theorems 3.2 and 3.3, suppose that the random variables $\xi_{\{\alpha,\beta\}}$ satisfy the Poincaré inequality with the same constant $c$. Then the sequence $\{X(\mathcal{N}_k)\}_{k=1}^{\infty}$ has the concentration property.

Before proving the proposition, we state an auxiliary estimate, which may be of interest in its own right. We get the estimate by combining the ideas of [FK81] as summarized in Proposition 7.9 above with the moment bound (53). We remark in
passing that under somewhat stronger assumptions, a considerably stronger asser- 
tion could be obtained by the methods of ISSMj . 

Lemma 11.9. Under the assumptions of Proposition 11.8 there exist constants $C > 0$ and $\epsilon > 0$ such that with $r(N) := \lfloor N^{\epsilon} \rfloor$ one has

(55) ^Eii X(A4)"'^'^^^ < 
for all sufficiently large k. 

Proof. By an obvious rescaling, we may assume without loss of generality that 
C(2) = 1, where C(2) is as defined in (|^ . We may further assume, so we claim, 

1/2 

that D = 0. To see that this is so, set {X{Mk))ai3 = C{q,/3} and suppose that 

the lemma holds with X{Mk) in place of X{Mk)- Then 

i?trX(A4)''-(^^) < A^fe|AU,ax(^(A/'fc))"'('^'=^ <A^fe[|AUax(^(A4)) + |i?|oo]"-^^^^ 

< 22'^(^'=)-iiV,[|AUa.(X(AA,.))2'-(^'=) + IZ?!^^^'-)] 

< 22'-(^'=)-i7Vfe[i?trX(A4)'''('^^^ + \D\^Z!''^^^ 

for all $k$ large enough. The claim is proved. We assume for the rest of the proof that $D = 0$. By (17), for any positive integer $n$,

ri+l 

(56) (L(AAfe),a;2") < ViVFK(2n,g)7Vr("+'' max E\i{b)\ 

beFK(2n,g) 

with $\mathrm{FK}(2n, q)$ denoting the collection of weak Wigner words of length $2n + 1$ and weight $q$, and $N_{\mathrm{FK}}$ as in (30). Note that

max E\S,{h)\ < C(3(2n + 2- 2q)) 

fc6FK(2n,?) 

< supS (exp (|e(a, /3)|/12V^)) (1 V (12^^))3(2«+2-2■/) [3(2n + 2 - 29)]! 

a.f) 

< 2(1 V (12VS))=^(^"+2-29) [3(2n + 2 - 2q)]\ 2Co^(2n+2-29) ^^2^ ^^^y^ ^ 
where the first inequality is due to (31) and the second to (53). Thus

n+l 

(I(AAfc),a;2") < 2"+i^iVr^"+'^[3(2n + 2-2g)]!(Con)^('"+'"'''^ 

q=l 
n 
as long as (6Con)^^/A^fc < 1/2. This completes the proof. □ 

Proof of Proposition 11.8. In view of Theorem 3.2 and the discussion in §11.7 it only remains to check (51). This is based on Lemma 11.9. Fix $C$ as in the statement of that lemma. Define the compact set $S = [-C - 1, C + 1]$. Suppose
that \f{x)\ < ci\x\'^^ and / is supported on S'^. Then, using that 

{\x\/{C + l/2))''(^'=) > \x\^''^ for > C + 1 and k large, 

one has 

E{[tYf{X(Afk))]^) < NkEtvf\X{Nk)) 

< N,clEj2Ki^k?''^\x.if^.)\>iC+i) 

i=l 



2—1 ^ 



(57) < iVM(^^j -.^^0. 

□ 

By combining Propositions 11.6 and 11.8 we immediately get the following theorem, which is the main result of this section. Recall that under the assumptions of Theorem 3.3 the sequence $\{X(\mathcal{N}_k)\}_{k=1}^{\infty}$ satisfies a polynomial-type CLT, i.e., there exists a mean zero Gaussian family $\{W_n\}_{n=0}^{\infty}$ of random variables such that for every polynomial function $f(x) = \sum_{i=0}^{n} a_i x^i$ the random variables $\operatorname{tr} f(X(\mathcal{N}_k)) - E \operatorname{tr} f(X(\mathcal{N}_k))$ converge in distribution as $k \to \infty$ to $W_f := \sum_{i=0}^{n} a_i W_i$.

Theorem 11.10. We work in the setting and under the hypotheses of Theorems 3.2 and 3.3. We make the following further assumptions:

• The random variables $\xi_{\{\alpha,\beta\}}$ satisfy the Poincaré inequality with the same constant $c$ (and hence $\{X(\mathcal{N}_k)\}_{k=1}^{\infty}$ has the concentration property).

• There exists a sequence $\{q_n\}_{n=1}^{\infty}$ of polynomial functions with the following properties:

  - For some compactly supported finite measure $\nu$ on $\mathbb{R}$, the sequence $\{q_n\}_{n=1}^{\infty}$ is an orthonormal system in $L^2(\nu)$.

  - Every polynomial in $x$ is a finite linear combination of the $q_n(x)$.

  - With $\hat q_n(x) := \int_0^x q_n(y)\,dy$, the covariance matrix $K(m,n) := E W_{\hat q_m} W_{\hat q_n}$ of the mean zero Gaussian family $\{W_{\hat q_n}\}_{n=1}^{\infty}$ is diagonal.

Then there exists a compact interval $T \supset \operatorname{supp} \nu$ and a constant $c > 0$ such that for any function $f$ of polynomial growth which is continuously differentiable on $T$, the random variables

\[ Z_{f,k} := \operatorname{tr} f(X(\mathcal{N}_k)) - E \operatorname{tr} f(X(\mathcal{N}_k)) \]

converge in distribution to a mean zero Gaussian random variable $Z_f$ with variance

\[ E Z_f^2 = \sum_{n=1}^{\infty} K(n,n)\, \langle \nu, f' q_n \rangle^2 \le c \sup_{t \in T} |f'(t)|^2. \tag{58} \]

12. Diagonalization by Chebyshev polynomials

We discuss two specializations of the band matrix model in which we can make $\mu$,
$\Phi(c,t)$, $\Theta(x,y)$, $\Psi(x,y)$, $\operatorname{Var} Z_f$ and $E_f$ as appearing in Theorems 3.2 and 3.3 much
more explicit and moreover apply Theorem 11.10. This will be possible because
in these specializations (slight variants of) Chebyshev polynomials diagonalize the
covariance matrix of the limiting mean zero Gaussian random variables.



12.1. Inversion of power series and p-Chebyshev polynomials. Our computation
involves the inversion of formal power series. Fix a sequence of real numbers
$\{a_i\}_{i=2}^\infty$ and define the formal power series

\[ p = p(t) := t + \sum_{i=2}^\infty a_i t^i. \tag{59} \]

(For the proof of Theorem 3.5 concerning the generalized Wigner matrix model, it
will be enough simply to take $p(t) = \Phi(t)$, where $\Phi(t)$ is the generating function
for the Catalan numbers defined in (62) below.) For each positive integer $n$, define
the $n$-th $p$-Chebyshev polynomial $T_{n,p}(x)$ as the unique polynomial in $x$ of degree $n$
with real coefficients such that $T_{n,p}(1/t)$ is the principal part of the Laurent series
$p(t)^{-n}$. Finally, define the matrix $P$ with rows and columns indexed by the positive
integers by setting $P_{ij}$ equal to the coefficient of $t^j$ in $p^i$, i.e.,

\[ P_{ij} := \operatorname{Res}_{t=0}\left(\frac{p(t)^i}{t^{j+1}}\,dt\right), \tag{60} \]

where for any sequence $\{c_i\}_{i=-\infty}^\infty$ of constants such that $c_i = 0$ for $i \ll 0$ we set

\[ \operatorname{Res}_{t=0} \sum_{i=-\infty}^\infty c_i t^i\,dt := c_{-1}. \]

Lemma 12.2. Fix $p(t)$ as in (59) with its associated $p$-Chebyshev polynomials
$T_{n,p}(x)$ and matrix $P$ as in (60). Identify power series in $x$ without constant term in
the obvious way with column vectors having entries indexed by the positive integers
(thus identifying polynomials in $x$ without constant term with finitely supported
infinite column vectors). Then, the $n$-th column of $P^{-1}$ equals $\frac{1}{n} x T'_{n,p}(x)$.

Proof. Let $r = r(t)$ be the formal power series inverse of $p(t)$, i.e., the unique power
series without constant term such that

\[ p(r(t)) = r(p(t)) = t. \]

By the Lagrange inversion formula, cf. [St99, §5.4],

\[ (P^{-1})_{ij} = \operatorname{Res}_{t=0}\left(\frac{r(t)^i}{t^{j+1}}\,dt\right) = \frac{i}{j}\operatorname{Res}_{t=0}\left(p(t)^{-j}\,t^{i-1}\,dt\right). \]

The last expression is by definition exactly the coefficient of $x^i$ in $\frac{1}{j} x T'_{j,p}(x)$. □
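
As a sanity check on Lemma 12.2, the following minimal sketch verifies the statement in exact integer arithmetic for the special case $p(t) = \Phi(t)$ (the Catalan generating function of (62)), where by (63) the $p$-Chebyshev polynomial is $T_n(x)$ with its constant term dropped. The truncation order `ORDER` and all function names are assumptions made for this illustration and do not come from the paper.

```python
from fractions import Fraction
from math import comb

ORDER = 8  # truncation order; an arbitrary choice for this sketch


def series_mul(a, b):
    """Multiply two power series (coefficient lists), truncating at t**ORDER."""
    out = [0] * (ORDER + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= ORDER:
                out[i + j] += ai * bj
    return out


def catalan_series():
    """Coefficients of Phi(t) = t + t^3 + 2 t^5 + 5 t^7 + ... up to t**ORDER."""
    phi = [0] * (ORDER + 1)
    for n in range((ORDER - 1) // 2 + 1):
        phi[2 * n + 1] = comb(2 * n, n) // (n + 1)
    return phi


def power_matrix(p):
    """P[i-1][j-1] = coefficient of t**j in p(t)**i, for 1 <= i, j <= ORDER."""
    rows, power = [], p[:]
    for _ in range(ORDER):
        rows.append(power[1:])
        power = series_mul(power, p)
    return rows


def invert_unit_upper(u):
    """Invert a unit upper-triangular matrix by back substitution."""
    n = len(u)
    inv = [[0] * n for _ in range(n)]
    for col in range(n):
        for row in range(n - 1, -1, -1):
            tail = sum(u[row][k] * inv[k][col] for k in range(row + 1, n))
            inv[row][col] = int(row == col) - tail
    return inv


def chebyshev_T(n):
    """Coefficients of the rescaled T_n, with T_n(z + 1/z) = z**n + z**-n."""
    prev, cur = [2], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        shifted = [0] + cur  # coefficients of x * T_k(x)
        padded = prev + [0] * (len(shifted) - len(prev))
        prev, cur = cur, [s - q for s, q in zip(shifted, padded)]
    return cur


def lemma_12_2_holds():
    """Compare column n of P^{-1} with the coefficients of x T_n'(x) / n."""
    p_inv = invert_unit_upper(power_matrix(catalan_series()))
    for n in range(1, ORDER + 1):
        t_n = chebyshev_T(n)
        for i in range(1, ORDER + 1):
            coeff = t_n[i] if i <= n else 0
            if Fraction(i * coeff, n) != p_inv[i - 1][n - 1]:
                return False
    return True
```

Because $P$ is upper triangular with unit diagonal, the leading block of the infinite inverse coincides with the inverse of the leading block, so the truncation does not affect the comparison.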



12.3. Chebyshev polynomials. 
12.3.1. Definition. For each positive integer $n$ we define the $n$-th Chebyshev
polynomial $T_n(x)$ of the first kind to be the unique polynomial in $x$ such that

\[ T_n(z + 1/z) = z^n + 1/z^n \quad (\text{equivalently: } T_n(2\cos\theta) = 2\cos n\theta) \]

and we define the $n$-th Chebyshev polynomial $U_n(x)$ of the second kind by the rule

\[ U_n(x) := \frac{1}{n} T_n'(x). \]

We have orthogonality relations

\[ \int_{-2}^{2} U_m(x) U_n(x) \frac{\sqrt{4 - x^2}}{2\pi}\,dx = \delta_{mn} \quad (m, n = 1, 2, 3, \dots) \tag{61} \]

as can be verified directly by the trigonometric substitution $x = 2\cos\theta$. Note that
the weight figuring in these orthogonality relations is the semicircle law $\sigma_S$ of mean 0
and variance 1. These relations say that the family $\{U_n(x)\}_{n=1}^\infty$ is the Gram-
Schmidt orthogonalization in $L^2(\sigma_S)$ of the family $\{x^{n-1}\}_{n=1}^\infty$ of powers of $x$. We
have analogous orthogonality relations for Chebyshev polynomials of the first kind
(with a different weight), but these we omit because we have no use for them.
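
The orthogonality relations (61) can be checked numerically. The sketch below builds the rescaled $T_n$ from the recurrence $T_{n+1}(x) = xT_n(x) - T_{n-1}(x)$ (which follows from the defining identity $T_n(z+1/z) = z^n + 1/z^n$), differentiates to get $U_n$, and integrates against the semicircle law with a Gauss-Chebyshev rule of the second kind. The function names and the number of quadrature nodes are assumptions of this illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial


def chebyshev_T(n):
    """Rescaled first-kind polynomial: T_n(2 cos(theta)) = 2 cos(n theta)."""
    x = Polynomial([0.0, 1.0])
    prev, cur = Polynomial([2.0]), x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, x * cur - prev
    return cur


def chebyshev_U(n):
    """Second-kind polynomial in the paper's convention: U_n = T_n' / n."""
    return chebyshev_T(n).deriv() / n


def semicircle_gram(max_n, nodes=64):
    """Gram matrix of U_1..U_max_n in L^2 of the semicircle law on [-2, 2].

    Gauss-Chebyshev quadrature of the second kind is exact for polynomial
    integrands of degree at most 2 * nodes - 1.
    """
    theta = np.arange(1, nodes + 1) * np.pi / (nodes + 1)
    x = 2.0 * np.cos(theta)
    w = 2.0 / (nodes + 1) * np.sin(theta) ** 2
    vals = np.array([chebyshev_U(n)(x) for n in range(1, max_n + 1)])
    return (vals * w) @ vals.T
```

Calling `semicircle_gram(6)` should return a matrix that agrees with the 6-by-6 identity up to floating-point error.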

12.3.2. Warning. Our definitions are not quite the standard ones. One usually
defines $T_n(\cos\theta) = \cos n\theta$ and $U_n(x) = \frac{1}{n+1} T'_{n+1}(x)$. We have rescaled and
re-indexed in order to obviate many annoying factors of 2 and shifts of 1. It is also
worth pointing out that in our set up the polynomial $xU_n(x)$ is monic of degree $n$,
and moreover even or odd according as $n$ is even or odd.
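
To make the change of convention concrete, the following sketch compares the rescaled polynomials with NumPy's standard Chebyshev basis: with the standard $T_n^{\mathrm{std}}$, one expects $T_n(x) = 2T_n^{\mathrm{std}}(x/2)$ and $U_n(x) = U_{n-1}^{\mathrm{std}}(x/2)$. The helper names are hypothetical and chosen only for this illustration.

```python
import numpy as np
from numpy.polynomial import Chebyshev, Polynomial


def rescaled_T(n):
    """The paper's T_n for n >= 1, via T_{k+1} = x T_k - T_{k-1}, T_0 = 2."""
    x = Polynomial([0.0, 1.0])
    prev, cur = Polynomial([2.0]), x
    for _ in range(n - 1):
        prev, cur = cur, x * cur - prev
    return cur


def rescaled_U(n):
    """The paper's U_n = T_n' / n."""
    return rescaled_T(n).deriv() / n


def conventions_agree(max_n=6):
    """Check T_n(x) = 2 T_n^std(x/2) and U_n(x) = U_{n-1}^std(x/2) on a grid."""
    x = np.linspace(-2.0, 2.0, 41)
    for n in range(1, max_n + 1):
        std_T = Chebyshev.basis(n)
        std_U = Chebyshev.basis(n).deriv() / n  # standard U_{n-1}
        if not np.allclose(rescaled_T(n)(x), 2.0 * std_T(x / 2.0)):
            return False
        if not np.allclose(rescaled_U(n)(x), std_U(x / 2.0)):
            return False
    return True


def x_times_U_coefficients(n):
    """Coefficients (lowest degree first) of x * U_n(x)."""
    return (Polynomial([0.0, 1.0]) * rescaled_U(n)).coef
```

For instance, `x_times_U_coefficients(4)` gives the coefficients of $x^4 - 2x^2$, which is monic of degree 4 and even, as the warning states.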

12.3.3. Reinterpretation of the Chebyshev polynomials. Consider the odd power
series

\[ \Phi(t) := \sum_{n=0}^\infty \frac{1}{n+1}\binom{2n}{n} t^{2n+1} = t + t^3 + 2t^5 + 5t^7 + \cdots \tag{62} \]

having the $n$-th Catalan number as the coefficient of $t^{2n+1}$. Clearly $\Phi(t)$ satisfies
the functional equation

\[ 1/t = \Phi(t) + 1/\Phi(t) \]

and hence more generally the functional equation

\[ T_n(1/t) = \Phi(t)^n + 1/\Phi(t)^n \tag{63} \]

for all positive integers $n$. In other words, for each positive integer $n$, the $n$-th
Chebyshev polynomial $T_n(x)$ (with its constant term dropped) may be reinterpreted
as the $n$-th $\Phi$-Chebyshev polynomial $T_{n,\Phi}(x)$, in the sense of Lemma 12.2.
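
Both functional equations can be confirmed on truncated power series with integer arithmetic. The sketch below clears denominators: it checks $t(1 + \Phi^2) = \Phi$ in place of $1/t = \Phi + 1/\Phi$, and $\Phi^n \cdot t^n T_n(1/t) = t^n(1 + \Phi^{2n})$ in place of (63). The truncation order and function names are assumptions of this illustration.

```python
from math import comb

ORDER = 20  # truncation order; an arbitrary choice for this sketch


def series_mul(a, b):
    """Multiply two power series (coefficient lists), truncating at t**ORDER."""
    out = [0] * (ORDER + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= ORDER:
                out[i + j] += ai * bj
    return out


def series_pow(a, n):
    """n-th power of a truncated power series."""
    result = [1] + [0] * ORDER
    for _ in range(n):
        result = series_mul(result, a)
    return result


def catalan_series():
    """Coefficients of Phi(t) up to t**ORDER (Catalan numbers at odd powers)."""
    phi = [0] * (ORDER + 1)
    for n in range((ORDER - 1) // 2 + 1):
        phi[2 * n + 1] = comb(2 * n, n) // (n + 1)
    return phi


def chebyshev_T(n):
    """Integer coefficients of the rescaled T_n for n >= 1."""
    prev, cur = [2], [0, 1]
    for _ in range(n - 1):
        shifted = [0] + cur
        padded = prev + [0] * (len(shifted) - len(prev))
        prev, cur = cur, [s - q for s, q in zip(shifted, padded)]
    return cur


def quadratic_identity_holds():
    """Check t * (1 + Phi^2) = Phi, i.e. 1/t = Phi + 1/Phi."""
    phi = catalan_series()
    square = series_mul(phi, phi)
    square[0] += 1
    return [0] + square[:-1] == phi


def identity_63_holds(n):
    """Check Phi^n * t^n T_n(1/t) = t^n * (1 + Phi^(2n)) for 1 <= n <= ORDER."""
    phi = catalan_series()
    reversed_T = chebyshev_T(n)[::-1]  # coefficients of t^n T_n(1/t)
    lhs = series_mul(series_pow(phi, n), reversed_T)
    tail = series_pow(phi, 2 * n)
    tail[0] += 1
    rhs = [0] * n + tail[: ORDER + 1 - n]
    return lhs == rhs
```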

12.3.4. Diagonalization formulas. In Lemma 12.2 let us now take

\[ p(t) = \Phi\left(\frac{t}{1 - \gamma t}\right), \]

where $\gamma$ is any real constant. (For the proof of Theorem 3.5 concerning the generalized
Wigner matrix model it will be enough to consider just the case $\gamma = 0$.) Note
that $T_{n,p}(x)$ and $T_n(x - \gamma)$ differ by a constant and hence $\frac{1}{n} x T'_{n,p}(x) = xU_n(x - \gamma)$.
Using the obvious identification between power series in $x$ without constant term
and row vectors with entries indexed by the positive integers, one may think of $p^i(x)$
as $e_i^T P$, where $P$ is the matrix from (60) and $e_i$ is the (infinite) column vector whose
$j$-th entry is $\delta_{ij}$. On the other hand, by the remark following (63), and Lemma 12.2,
using the obvious identification between power series in $x$ without constant term
and column vectors with entries indexed by the positive integers, one can identify
$xU_n(x - \gamma)$ with $P^{-1}e_n$. Hence, for any sequence $\{\eta_i\}_{i=1}^\infty$ of real constants,



(64) {Il'^^^[T^t) ^'^^ 

and similarly 

(65) *(t^) ^''Um{x-j)yUniy-j)^ =VnSmn, 

for all positive integers m and n. 

12.4. First specialization: generalized Wigner matrices. 
12.4.1. Specialization of the model. As in Theorem 3.5 assume now that

£' = 0, / s^^\c,c')e{dc') = l. 



This specialization of the band matrix model we call the generalized Wigner matrix
model. In the case that $s^{(2)} = 1$ this is more or less the standard Wigner matrix
model, whence the terminology.

12.4.2. Calculation of $\Phi(c,t)$ and $\mu$. In the case at hand $\Phi(c,t)$ must be independent
of $c$, and hence by the functional equation satisfied by $\Phi(c,t)$ we have $\Phi(c,t) = \Phi(t)$,
where the latter is as defined in (62). From $\Phi(t)$ we can read the moments of $\mu$. We conclude
that $\mu$ is the semicircle law $\sigma_S$ of mean 0 and variance 1.
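
The conclusion can be illustrated by simulation. The sketch below samples one generalized Wigner matrix with a non-constant variance profile whose rows integrate to 1, and compares the second and fourth empirical moments with the Catalan numbers 1 and 2 (the corresponding moments of the semicircle law). The particular profile $1 + \tfrac12\cos(2\pi(x-y))$, the Gaussian entries, the matrix size and the seed are all assumptions chosen for this illustration, not taken from the paper.

```python
import numpy as np


def variance_profile(n):
    """Symmetric, positive profile f(i/n, j/n) whose rows average exactly to 1."""
    x = np.arange(1, n + 1) / n
    return 1.0 + 0.5 * np.cos(2.0 * np.pi * (x[:, None] - x[None, :]))


def sample_generalized_wigner(n, rng):
    """Symmetric matrix with independent on-or-above-diagonal Gaussian entries
    of variance f(i/n, j/n) / n."""
    g = rng.standard_normal((n, n))
    xi = np.triu(g) + np.triu(g, 1).T  # symmetrize the upper triangle
    return np.sqrt(variance_profile(n) / n) * xi


def empirical_moments(n, seed=0):
    """Second and fourth moments of the empirical eigenvalue distribution."""
    rng = np.random.default_rng(seed)
    eig = np.linalg.eigvalsh(sample_generalized_wigner(n, rng))
    return float(np.mean(eig ** 2)), float(np.mean(eig ** 4))
```

For a matrix of size a few hundred, the two returned values should already be close to 1 and 2 respectively, with deviations of order $1/N$.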

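12.4.3. Calculation of $\Theta(x,y)$ and $\Psi(x,y)$. Because $\Phi(c,t) = \Phi(t)$, the integrals
figuring in the definitions of $\Theta(x,y)$ and $\Psi(x,y)$ greatly simplify. We find that

\[ \Theta(x,y) = \sum_{r=1}^\infty \lambda_r \Phi(x)^r \Phi(y)^r, \qquad \Psi(x,y) = \sum_{r=1}^\infty \epsilon_r \Phi(x)^r \Phi(y)^r, \tag{66} \]

where

\[ \lambda_r = \frac{1}{r} \int \cdots \int K_r(c_1, \dots, c_r)\,\theta(dc_1) \cdots \theta(dc_r), \]

\[ \epsilon_r = \begin{cases} \int \left(d^{(2)}(c) - 2s^{(2)}(c,c)\right)\theta(dc) & \text{if } r = 1, \\ \frac{1}{2}\int\!\!\int \left(s^{(4)}(c_1,c_2) - 3s^{(2)}(c_1,c_2)^2\right)\theta(dc_1)\theta(dc_2) & \text{if } r = 2, \\ 0 & \text{if } r \ge 3. \end{cases} \]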


12.4.4. Calculation of $\operatorname{Var} Z_f$ and $E_f$. Using the orthogonality relations (61), for
any polynomial function $f$ we can write

\[ f'(x) = \sum_{n=1}^\infty U_n(x)\,E f'(S) U_n(S) \]

where $S$ is a random variable with standard semicircular law $\sigma_S$. (Only finitely
many nonzero terms appear in the sum.) Then, using the diagonalization formulas
(65) (with $\eta_i = \lambda_i$) and (64) (with $\eta_{2i} = \lambda_i$ and $\eta_{2i+1} = 0$), we can in the present
specialization of the band matrix model rewrite $\operatorname{Var} Z_f$ and $E_f$ in the form

\[ \operatorname{Var} Z_f = \sum_{r=1}^\infty (2\lambda_r + \epsilon_r)\left(E f'(S) U_r(S)\right)^2, \tag{67} \]

00 ^ 

(68) i?; = ^-(A,+e,)£;/'(5)[/2.(5). 
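
As a worked check of the variance formula (67), read here as $\operatorname{Var} Z_f = \sum_r (2\lambda_r + \epsilon_r)(E f'(S)U_r(S))^2$, the following sketch evaluates it for $f(x) = x^2$ in the classical Gaussian case and compares it with a Monte Carlo estimate of $\operatorname{Var}\operatorname{tr} X^2$. The choice of entry distributions (off-diagonal variance 1, diagonal variance 2, hence $\lambda_r = 1/r$ and $\epsilon_r = 0$ under the reading of (66) used here), the matrix size, the sample count and all function names are assumptions of this illustration.

```python
import numpy as np


def semicircle_mean(h, nodes=64):
    """E h(S) for the semicircle law on [-2, 2], by Gauss-Chebyshev quadrature."""
    theta = np.arange(1, nodes + 1) * np.pi / (nodes + 1)
    w = 2.0 / (nodes + 1) * np.sin(theta) ** 2
    return float(np.sum(w * h(2.0 * np.cos(theta))))


def U(n, x):
    """U_n(x) via U_0 = 0, U_1 = 1, U_{k+1} = x U_k - U_{k-1}."""
    prev, cur = np.zeros_like(x), np.ones_like(x)
    for _ in range(n - 1):
        prev, cur = cur, x * cur - prev
    return cur


def limiting_variance(f_prime, lam, eps, r_max):
    """Sum over r <= r_max of (2 lam(r) + eps(r)) * (E f'(S) U_r(S))^2."""
    total = 0.0
    for r in range(1, r_max + 1):
        c_r = semicircle_mean(lambda x, r=r: f_prime(x) * U(r, x))
        total += (2.0 * lam(r) + eps(r)) * c_r ** 2
    return total


def simulated_variance_tr_x2(n, samples, seed=0):
    """Sample variance of tr X^2 for symmetric Gaussian X with entry variances
    1/n off the diagonal and 2/n on the diagonal."""
    rng = np.random.default_rng(seed)
    off = np.triu(rng.standard_normal((samples, n, n)), 1)
    diag = np.sqrt(2.0) * rng.standard_normal((samples, n))
    tr_x2 = (2.0 * np.sum(off ** 2, axis=(1, 2)) + np.sum(diag ** 2, axis=1)) / n
    return float(np.var(tr_x2, ddof=1))
```

With `f_prime = lambda x: 2 * x`, `lam = lambda r: 1 / r` and `eps = lambda r: 0.0`, only the term $r = 2$ contributes, and the formula gives 4; a direct computation shows the finite-$N$ variance of $\operatorname{tr} X^2$ in this Gaussian case is $4 + 4/N$.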

12.5. Proof of Theorem 3.5. In view of Theorem 3.3, the discussion in §12.4
immediately above, and Theorem 11.10, all that remains to be done is to take in
the latter theorem $\nu = \sigma_S$ and $q_n(x) = U_n(x)$. □

12.6. Second specialization: generalized Wishart matrices. 

12.6.1. Square-root generalized Wishart matrices. Assume that color space is
decomposed as a disjoint union $A \cup B$ and that

\[ 0 < \theta(A) \le \theta(B). \]

Put 

^-'M''^ 

Assume that s^^^ and s^^^ vanish identically on {A x A)U (B x B) and that 

A Jb 

Assume that D = 0. Assume that S''^ = for all A: > 0. This specialization of the 
band matrix model we call the generalized square-root Wishart matrix model. 

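12.6.2. Generalized Wishart matrices. As we are about to see, the machinery we
developed is well-suited to deal with square-root generalized Wishart matrices. In
applications, however, one is often interested in a slight variant. Write

\[ \mathcal{N} = \mathcal{N}^A \cup \mathcal{N}^B \]

where $\mathcal{N}^A$ (resp., $\mathcal{N}^B$) is the set of letters in $\mathcal{N}$ with color in $A$ (resp., $B$). As
usual let $N$, $N^A$ and $N^B$ denote the corresponding cardinalities. By re-arranging
coordinates, $X(\mathcal{N})$ can be written in the form

\[ X(\mathcal{N}) = \begin{bmatrix} 0 & Y(\mathcal{N}) \\ Y(\mathcal{N})^T & 0 \end{bmatrix} \]

where the matrix $Y(\mathcal{N})$ has rows indexed by $\mathcal{N}^A$ and columns indexed by $\mathcal{N}^B$. In
this situation, we call the symmetric random matrices

\[ W(\mathcal{N}) = Y(\mathcal{N}) Y(\mathcal{N})^T \]

(rows and columns indexed by $\mathcal{N}^A$) generalized Wishart matrices, and we are
interested in the empirical distribution of their eigenvalues $\{\lambda_i(W(\mathcal{N}))\}_{i=1}^{N^A}$ (all
nonnegative):

\[ L_W(\mathcal{N}) := \frac{1}{N^A} \sum_{i=1}^{N^A} \delta_{\lambda_i(W(\mathcal{N}))}. \]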




In the case that $s^{(2)}$ is constant on $(A \times B) \cup (B \times A)$, the spectrum of $W(\mathcal{N})$ is
the same as that of standard Wishart matrices, whence the terminology. Note that
for any function $g(t)$ on $\mathbb{R}_+$ with $g(0) = 0$, and setting $\tilde g(t) = g(t^2)$, one has

\[ N\langle L(\mathcal{N}), \tilde g\rangle = \operatorname{tr} \tilde g(X(\mathcal{N})) = 2\operatorname{tr} g(W(\mathcal{N})) = 2N^A\langle L_W(\mathcal{N}), g\rangle. \tag{69} \]

Hence, once results (either LLN or CLT) are derived for $X(\mathcal{N})$, it is a simple
exercise in book-keeping to transform them to statements about $W(\mathcal{N})$.
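
The middle equality of the bookkeeping principle (69) is a statement about a single block matrix and can be checked directly. The sketch below builds a symmetric matrix with zero diagonal blocks and off-diagonal block `Y`, and compares $\operatorname{tr}\tilde g(X)$ with $2\operatorname{tr} g(YY^T)$ through eigenvalues. The function name, the test function $g$ and the matrix sizes are assumptions of this illustration.

```python
import numpy as np


def bookkeeping_sides(y, g):
    """Return (tr g~(X), 2 tr g(W)) for X = [[0, Y], [Y^T, 0]] and W = Y Y^T,
    where g~(t) = g(t**2) and g is applied to eigenvalues elementwise."""
    n_a, n_b = y.shape
    x = np.block([[np.zeros((n_a, n_a)), y], [y.T, np.zeros((n_b, n_b))]])
    w = y @ y.T
    lhs = float(np.sum(g(np.linalg.eigvalsh(x) ** 2)))
    rhs = float(2.0 * np.sum(g(np.linalg.eigvalsh(w))))
    return lhs, rhs
```

The identity relies on $g(0) = 0$: the nonzero eigenvalues of $X$ come in pairs $\pm\sqrt{\lambda_i(W)}$, and the remaining eigenvalues of $X$ vanish and so contribute nothing.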

12.6.3. Calculation of $\Phi(c,t)$. Under our additional assumptions in §12.6.1 we can
write

\[ \Phi(\cdot, t) = \mathbf{1}_A \Phi_A(t) + \mathbf{1}_B \Phi_B(t) \]

where $\Phi_A(t)$ and $\Phi_B(t)$ are color-independent. The functional equation for $\Phi(c,t)$ in the case
at hand specializes to the functional equation

which in turn can be rewritten as the pair of functional equations 

\[ \Phi_A(t) = t\,(1 - t\alpha\,\Phi_B(t))^{-1}, \qquad \Phi_B(t) = t\,(1 - t\beta\,\Phi_A(t))^{-1}. \]

After a straightforward calculation with formal power series we find that 

\[ \int \Phi(c,t)\,\theta(dc) = \theta(A)\Phi_A(t) + \theta(B)\Phi_B(t) \]



(70) 



where $\Phi(t)$ is as in §12.3.3.

12.6.4. Calculation of $\mu$. From (70) we know the moments of the measure $\mu$ and
moreover we can compare these moments to those of the semicircle distribution.
We find that



(72, /) . f^m + ^ / 

V T 11^ J\x^~^\<2 \A 

Now $\mu$ is the weak limit in probability of the empirical distributions $L(\mathcal{N}_k)$. To
calculate the corresponding limit $\mu_W$ of the empirical distributions $L_W(\mathcal{N}_k)$ we use
the "bookkeeping principle" (69) to find that

,73, (,„./, . r mvieeeji,,. 

See e.g. [PM67] for the latter result in the case of Wishart matrices. Note that if
$\theta(A) < \theta(B)$ and hence $\gamma > 2$, the measure $\mu$ has some mass concentrated at the
origin. Notice also that if $\gamma = 2$, then $\mu$ is the semicircle distribution.

12.6.5. Calculation of $\Theta(x,y)$ and $\Psi(x,y)$. With $\lambda_{2r}$ and $\epsilon_{2r}$ as defined in (66), we
have

oo / 2\'' / 2\'' 

= |:-*(t^) *(t^) ■ 

To verify these formulas the main thing to note is that $K_r(c_1, \dots, c_r) = 0$ unless
colors along the sequence $c_1, \dots, c_r, c_1$ alternate between $A$ and $B$. It follows in
particular that $K_r = 0$ for odd $r$.

12.6.6. The measure $\nu$ and associated orthogonal polynomials. Let $\nu$ be the measure
with density

\[ \frac{d\nu}{dx} = \mathbf{1}_{|x^2 - \gamma| < 2}\,\frac{\sqrt{4 - (x^2 - \gamma)^2}}{2\pi |x|} \]

with respect to Lebesgue measure. Note that $\mu$ is a convex combination of $\nu$ and a
unit mass at the origin. Note that if $\gamma = 2$ then $\nu = \sigma_S$. Put

\[ V_n(x) := xU_n(x^2 - \gamma). \]

By a straightforward calculation one verifies that the system of polynomial functions
$\{V_n\}_{n=1}^\infty$ is orthonormal in $L^2(\nu)$, and moreover forms the "odd part" of the
family of orthogonal polynomials naturally associated to the weight $\nu$. By another
straightforward calculation one verifies that for any continuously differentiable
function $g$, setting $\tilde g(x) = g(x^2)$, one has

\[ \langle \nu, (\tilde g)' V_n \rangle = 2E g'(S + \gamma) U_n(S), \tag{74} \]

where $S$ is a random variable with standard semicircular law $\sigma_S$.
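
Both "straightforward calculations" can be checked numerically. The sketch below assumes the density of $\nu$ is $\mathbf{1}_{|x^2-\gamma|<2}\sqrt{4-(x^2-\gamma)^2}/(2\pi|x|)$ (the reading of the display above that is consistent with orthonormality of the $V_n$), integrates over $x > 0$ with a midpoint rule and doubles the result for even integrands. The value $\gamma = 3$, the test function $g(t) = t^3$, the grid size and the function names are assumptions of this illustration.

```python
import numpy as np


def U(n, u):
    """U_n(u) via U_0 = 0, U_1 = 1, U_{k+1} = u U_k - U_{k-1}."""
    prev, cur = np.zeros_like(u), np.ones_like(u)
    for _ in range(n - 1):
        prev, cur = cur, u * cur - prev
    return cur


def V(n, x, gamma):
    """V_n(x) = x U_n(x^2 - gamma)."""
    return x * U(n, x ** 2 - gamma)


def nu_integral(h, gamma, points=200_000):
    """Integral of an EVEN function h against nu (requires gamma >= 2).

    Midpoint rule on the positive half of the support, doubled by symmetry.
    """
    a, b = np.sqrt(gamma - 2.0), np.sqrt(gamma + 2.0)
    step = (b - a) / points
    x = a + step * (np.arange(points) + 0.5)
    density = np.sqrt(4.0 - (x ** 2 - gamma) ** 2) / (2.0 * np.pi * x)
    return float(2.0 * np.sum(h(x) * density) * step)


def semicircle_mean(h, nodes=64):
    """E h(S) for the semicircle law, by Gauss-Chebyshev quadrature."""
    theta = np.arange(1, nodes + 1) * np.pi / (nodes + 1)
    w = 2.0 / (nodes + 1) * np.sin(theta) ** 2
    return float(np.sum(w * h(2.0 * np.cos(theta))))


def gram_matrix(max_n, gamma):
    """Matrix of inner products <nu, V_m V_n> for 1 <= m, n <= max_n."""
    return np.array([[nu_integral(lambda x: V(m, x, gamma) * V(n, x, gamma), gamma)
                      for n in range(1, max_n + 1)]
                     for m in range(1, max_n + 1)])


def identity_74_sides(n, gamma):
    """Both sides of (74) for g(t) = t^3, so g~(x) = x^6 and g~'(x) = 6 x^5."""
    lhs = nu_integral(lambda x: 6.0 * x ** 5 * V(n, x, gamma), gamma)
    rhs = 2.0 * semicircle_mean(lambda s: 3.0 * (s + gamma) ** 2 * U(n, s))
    return lhs, rhs
```

For $n = 1$ and $\gamma = 3$ both sides of (74) equal $6(1 + \gamma^2) = 60$, since $E S = 0$ and $E S^2 = 1$.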

12.6.7. Calculation of $\operatorname{Var} Z_f$ and $E_f$. For any even polynomial function $f$, by the
orthogonality relations noted above, we can write

\[ f'(x) = \sum_{n=1}^\infty V_n(x)\,\langle \nu, f' V_n \rangle, \]

with only finitely many nonzero terms in the sum. Then, by formulas (65) and
(64), we can in the present specialization of the band matrix model rewrite
$\operatorname{Var} Z_f$ and $E_f$ in the form

\[ \operatorname{Var} Z_f = \sum_{r=1}^\infty (2\lambda_{2r} + \epsilon_{2r})\,\langle \nu, f' V_r \rangle^2, \tag{75} \]



(76) i?/-5]Tr(A2,. + e2.)(i^,/V2,.), 

at least when $f$ is an even polynomial function. But then these formulas must
remain valid for any polynomial $f$ even or not since $\operatorname{tr} f(X(\mathcal{N}_k))$ vanishes identically
for odd $f$.

12.6.8. Calculation of $\operatorname{Var} Z_{g,W}$. Given a polynomial function vanishing at the
origin consider the random variables

\[ Z_{g,W,k} := \operatorname{tr} g(W(\mathcal{N}_k)) - E\operatorname{tr} g(W(\mathcal{N}_k)) \]

where $g$ is any continuously differentiable function of polynomial growth. By the
bookkeeping principle (69), with $\tilde g(x) = g(x^2)$, we have $Z_{g,W,k} = \frac{1}{2} Z_{\tilde g,k}$; hence when
$g$ is a polynomial function, the random variables $Z_{g,W,k}$ converge in distribution to
a mean zero Gaussian random variable $Z_{g,W}$ with variance

\[ \operatorname{Var} Z_{g,W} = \sum_{r=1}^\infty (2\lambda_{2r} + \epsilon_{2r})\left(E g'(S + \gamma) U_r(S)\right)^2, \tag{77} \]

where, as in formula (74), $S$ is a random variable with law $\sigma_S$. An analogous
evaluation of the shift in mean can also be provided, but to avoid repetitions, we
do not state it here.
Our final result is 

Theorem 12.7. We work in the setting and under the hypotheses of Theorems 3.2
and 3.3 and in the specialization of the band matrix model discussed in §12.6. If
the random variables $\xi_{\{\alpha,\beta\}}$ satisfy a Poincaré inequality with the same constant
$c$, then for any continuously differentiable function $g$ with polynomial growth, the
random variables $Z_{g,W,k}$ converge in distribution to a mean zero Gaussian random
variable $Z_{g,W}$, with variance once again given by (77).

Proof. Use Theorem 11.10 for the square-root generalized Wishart matrices $X(\mathcal{N}_k)$,
taking $\nu$ to be as defined in §12.6.6, $\{q_n(x)\}_{n=1}^\infty$ to be the Gram-Schmidt
orthogonalization in $L^2(\nu)$ of the family $\{x^{n-1}\}_{n=1}^\infty$ of powers of $x$, and $f(x) = g(x^2)$. □

13. Concluding remark 

We have chosen to concentrate in this paper on CLT's for symmetric matrices.
Similar techniques work also with Hermitian matrices, the main difference being
that with $\xi_{\alpha,\beta} = \xi^{(1)}_{\alpha,\beta} + i\,\xi^{(2)}_{\alpha,\beta} = \bar\xi_{\beta,\alpha}$ when $\alpha \ne \beta$, and $\xi^{(1)}_{\alpha,\beta}$ and $\xi^{(2)}_{\alpha,\beta}$
independent, identically distributed, zero mean real-valued random variables, it holds that
$E[\xi_{\alpha,\beta}^2] = 0$, and hence in the combinatorial evaluation of the contribution of
various terms in expansions similar to (17), the contribution of words in which an edge
is traversed twice in the same direction vanishes. (In particular, when computing
variances for linear statistics of polynomial type, some of the bracelet contributions
vanish.) None of the modifications needed to handle the Hermitian case are
difficult. However, there are sufficiently many such modifications needed so that to
give a careful accounting of them would add a nontrivial number of pages to an
already long paper. So we think it best to omit further discussion.

Acknowledgments. We owe the idea to look at spanning forests when proving Lemma
4.10 to Victor Reiner. We also thank Sergey Bobkov for a useful discussion
concerning Poincaré inequalities.